TOPICS IN

QUEUEING  NETWORK  MODELING 


Preface 


Queueing network models are finding growing acceptance as
useful, cost-effective tools for evaluating computer system
performance. This report is a collection of related papers on
queueing network modeling by members of Project SAM in the
Computer Systems Research Group. Of the seven papers included, five
have been externally refereed and will be published in conference
proceedings. This preface highlights the results of each paper.

Recent  research  results  have  extended  the  usefulness  of 
queueing  network  models  by  allowing  them  to  handle  multiple  job 
classes,  general  service  time  distributions,  and  sophisticated 
scheduling  disciplines.  As  a  further  extension,  K.C.  Sevcik 
studies  priority  scheduling  disciplines  in  queueing  network  models 
in  the  lead  paper  beginning  on  page  1.  Priority  scheduling  models 
are approximated by several models that can be analyzed by
efficient computational algorithms. These latter models vary in their
complexity  and  accuracy,  and  provide  optimistic  and  pessimistic 
performance  bounds  on  the  priority  models.  The  concept  of  a 
"shadow  CPU"  introduced  in  this  paper  appears  to  be  useful  in  the 
context  of  more  general  priority  scheduling  disciplines. 

The issue of approximation is treated in more detail in the
second paper beginning on page 19. K.C. Sevcik, A.I. Levy,
S.K. Tripathi, and J. Zahorjan consider decomposition, the process
in which a subnetwork is independently analyzed and then replaced
in the overall network by a single composite server, selected to
represent the interaction of the subnetwork with the rest of the
system. They discuss and extend techniques that have been proposed
for establishing the characteristics of the composite server. In
particular, formulae for approximating the coefficient of variation
of transition processes (the flow of jobs among servers) are
established.

A queueing network model is used in the third paper, beginning
on page 43, to study the influence of workload on interactive
response  time  in  a  virtual  memory  system.  Using  open  and  closed 
queueing  network  models,  S.K.  Tripathi  and  K.C.  Sevcik  show  that 
the  upper  limit  on  multiprogramming  level  affects  both  system 
throughput  and  mean  response  time.  They  identify  multiprogramming 
limits  that  maximize  throughput  or  minimize  mean  response  time  in 
their  memory-constrained  model.  The  authors  also  discuss  the 
usefulness  of  CPU  utilization  as  a  performance  measure  in  an  open  system 
without  constant  backlog. 

In  the  fourth  paper,  beginning  on  page  59,  E.D.  Lazowska 
develops a new technique for matching general service time
distributions in central server queueing network models. He first shows
that the traditional approach, attempting to match several higher
moments  of  these  distributions,  is  not  only  insufficient,  but  also 
unnecessary  and  frequently  misleading.  The  magnitude  of  the 
error  is  demonstrated  using  a  model  of  the  University  of  Toronto 
Computer  Centre.  He  shows  that  the  effect  of  the  CPU  service  time 
distribution  in  central  server  models  is  fully  determined  by 
the  value  of  its  Laplace  transform  at  several  points.  This  result 
leads  to  methodologies  that  select  server  parameters  by  matching 
either  certain  percentiles  or  certain  Laplace  transform  values  of 
the observed service time distribution. The technique is
demonstrated on several examples.

A  closed  queueing  network  model  also  serves  as  the  context 
for  investigating  rotational  position  sensing  disk  storage  systems 
in the fifth paper beginning on page 85. J. Zahorjan develops
an open single server queue model to represent a single disk
module. This single queue model is then used as a component in a
closed queueing network model, and an approximate solution
technique provides performance measurements. The predicted server
utilizations and queue lengths are close to those given by
simulation. The network model is also used to compare
shortest-seek-time-first scheduling to first-come-first-served
scheduling.

The sixth paper, beginning on page 107, gives an overview
of the QSOLVE system, an automated queueing network solution system
using the global balance method. J. Zahorjan and A.I. Levy discuss
the functional and operational capabilities of the system, and
report on user experience. QSOLVE has been a useful research tool
for the investigations by Project SAM into queueing network
modeling.

The  final  paper,  beginning  on  page  111,  investigates  the 
ability  of  memory  management  policies  to  act  as  load  controllers 
in  a  multiprogrammed  virtual  memory  computer  system.  G.S.  Graham 
and  P.J.  Denning  use  a  simple  queueing  network  model  to  study 
memory  policies,  with  extensive  use  of  address  reference  strings 
from  actual  virtual  memory  programs  to  generate  parameter  values 
for  the  model.  They  conclude  that  the  Working  Set  algorithm  has 
several performance advantages over the Page Fault Frequency
algorithm.


Project  SAM,  in  the  Computer  Systems  Research  Group,  is  a 
continuing  research  project  investigating  the  theory  and  practice 
of  queueing  network  models.  The  work  of  Project  SAM  is  supported 
in  part  by  individual  operating  grants  from  the  National  Research 
Council  of  Canada. 
PRIORITY SCHEDULING DISCIPLINES IN
QUEUEING  NETWORK  MODELS  OF  COMPUTER  SYSTEMS 


Kenneth  C.  Sevcik 


Queueing  network  models  are  finding  growing  acceptance  as  useful, 
cost-effective  tools  for  predicting  computer  system  performance 
under  various  hypothesized  changes  in  system  environment. 
Algorithms  for  efficiently  calculating  various  performance 
measures  have  been  devised  for  a  broad  range  of  queueing  network 
models.  This  range  includes  models  with  such  realistic  features 
of  computer  systems  as  distinct  classes  of  jobs,  arbitrary 
service  time  distributions,  and  scheduling  disciplines  such  as 
first-come-first-served, last-come-first-served, and
processor-sharing. However, other features of computer systems, such as
priority  scheduling  disciplines,  cannot  be  represented  directly 
in  queueing  network  models  without  rendering  inapplicable  the 
algorithms  for  efficient  calculation  of  performance  measures.  In 
this  paper,  we  propose  techniques  for  studying  priority 
scheduling  disciplines  in  queueing  network  models  by  using 
approximations  based  on  models  that  can  be  treated  by  the 
efficient analysis algorithms. We are able to establish bounds,
quite tight in many circumstances, on such performance
measures as device utilization, throughput, and average response
time.


*This paper has been accepted for publication in the Proceedings
of IFIP Congress '77, August 8-12, 1977, Toronto, Ontario, Canada.

1. INTRODUCTION


The size and complexity of large computer systems make it
difficult to predict the impact on performance of various changes
to a system or its environment. Managers responsible for keeping
system performance at a high level need techniques for predicting
system performance under such changes as new hardware components
(faster cpu, additional data channel, etc.), changes in software
strategies (new scheduling disciplines, different placement of
data sets with respect to channels, etc.), and changes in
workload composition (increased proportion of interactive work,
etc.). Recently, analytic models based on queueing networks have
been shown in several cases to provide predictive ability
comparable to that of detailed simulation, yet with a
smaller investment in money and time. [1-4]


A queueing network model represents a computer system as a set of
service centers interconnected by transition paths. Jobs are
represented by tokens that move from one service center to
another according to a set of transition probabilities. At each
service center, a job awaits its turn for service, receives an
amount of service selected from a given service time
distribution, and then proceeds to another service center. Figure
1 shows the two queueing network models that will be used as
examples in this paper. The service time distributions are
assumed to be exponential. The mean service time (or inverse of
the service rate) is indicated at each service center. There are two
distinct types of jobs (class A and class B). The probabilities
associated with each transition path are indicated for both
classes. Note that the mean service time at the service center
representing the central processing unit differs for the two
classes, while it is the same for both classes at all channel
servers.

In the past few years, the early work on analysis of queueing
networks [5,6] has been substantially extended. Buzen [7] developed
an easily implemented and very efficient computational
algorithm for deriving performance measures from closed queueing
networks with exponential servers and a single class of
customers. The method is based on exploiting the "product form"
of state probabilities, where the probability of a given system
state is the product of terms each of which relates to the
conditions holding at one particular service center.
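
To make the product form computation concrete, here is a minimal Python sketch of a convolution-style calculation of the normalizing constants for a closed, single-class network of fixed-rate exponential servers. The function names and the use of per-center "demands" (visit ratio times mean service time) are assumptions chosen for this illustration; they are not notation taken from the paper.

```python
def normalizing_constants(demands, n_jobs):
    """Return [G(0), ..., G(n_jobs)] for a closed single-class network.

    demands[k] is the relative service demand at center k
    (visit ratio * mean service time); this naming is an assumption.
    """
    g = [1.0] + [0.0] * n_jobs
    # Fold in one service center at a time (convolution step).
    for d in demands:
        for n in range(1, n_jobs + 1):
            g[n] += d * g[n - 1]
    return g


def throughput_and_utilizations(demands, n_jobs):
    """Derive throughput and per-center utilizations from G(N-1)/G(N)."""
    g = normalizing_constants(demands, n_jobs)
    throughput = g[n_jobs - 1] / g[n_jobs]
    utilizations = [d * throughput for d in demands]
    return throughput, utilizations


if __name__ == "__main__":
    # Two balanced centers, two jobs: three equally likely states.
    x, u = throughput_and_utilizations([1.0, 1.0], 2)
    print(x, u)
```

The cost is proportional to the number of centers times the number of jobs, which is what makes this kind of algorithm attractive compared with enumerating system states.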

Baskett, Chandy, Muntz, and Palacios [8] demonstrated that
a more general class of queueing networks retains the property of
product form state probabilities. These networks can include
multiple customer classes, each defined by a set of transition
probabilities and a mean service time for each service center;
service time distributions need not be exponential, and
arrivals and departures of customers to and from the network are
allowed. However, to retain the product form state probability
property, the scheduling discipline at each service center must
be first-come-first-served (FCFS), preemptive
last-come-first-served (PLCFS), processor-sharing (PS), or no-queueing.

FIGURE 1. QUEUEING NETWORK MODELS EXAMINED
(Service Times Are Given In Milliseconds)
CASE I. Central Server Model With Three Channels
CASE II. Central Server Model With Five Channels

(Processor-sharing is the limiting case of round-robin scheduling
as quantum length shrinks to zero, and no-queueing is the
scheduling discipline in the case where one job's progress at the
service center is in no way impeded by the presence of other
jobs.) FCFS service centers must have exponentially distributed
service times.

Generalizations of Buzen's computational method can be used to
derive  performance  measures  efficiently  from  the  more  general 
networks  defined  in  [8].  A  number  of  such  generalized 
computational  methods  have  been  designed  [9-11],  and  software 
packages  based  on  such  methods  are  available.  [12-14]  For  a  given 
network  defined  by  its  structure,  transition  probabilities, 
service  time  distributions,  and  numbers  of  customers  in  each 
class,  these  software  packages  produce,  for  each  class,  service 
center  utilizations,  mean  queue  lengths,  throughput,  and  mean 
response  time.  For  the  computational  results  presented  in  this 
paper,  a  special  program  based  on  the  computational  algorithm 
described by Chandy, Herzog and Woo [9] was used.


Queueing  networks  that  do  not  satisfy  the  constraints  that 
guarantee  a  product  form  for  the  state  probabilities  can  be 
analyzed  to  obtain  performance  measures,  but  much  more 
computational  effort  is  required  in  general.  The  most  general 
solution  method  is  the  global  balance  technique,  which  involves 
equating  the  rate  of  passage  into  each  system  state  with  the  rate 
of  passage  out  of  that  state.  This  can  result  in  a  single  linear 
equation  for  each  state  of  the  system.  The  system  of 
simultaneous  linear  equations  can  be  solved  to  determine  the 
steady-state  probability  of  each  system  state.  Since  the  size  of 
the  system  state  space  grows  exponentially  with  the  number  of 
service  centers,  jobs,  and  classes,  the  derivation  and  solution 
of  the  set  of  global  balance  equations  is  feasible  only  in 
queueing  networks  involving  very  few  service  centers,  jobs,  and 
classes. 
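
As a small illustration of the global balance technique, the following Python sketch builds the balance equations for a toy closed network of two exponential servers in a cycle and solves them directly. The toy network, the function names, and the use of Gaussian elimination are all assumptions made for this example (the paper itself later mentions Gauss-Seidel iteration for its exact solutions). A helper also counts the states of a closed single-class network, which suggests why the approach becomes expensive quickly.

```python
from math import comb


def state_count(n_centers, n_jobs):
    """Number of ways to place n_jobs identical jobs at n_centers."""
    return comb(n_jobs + n_centers - 1, n_centers - 1)


def cyclic_generator(n_jobs, mu1, mu2):
    """Transition-rate matrix; state n = number of jobs at center 1."""
    q = [[0.0] * (n_jobs + 1) for _ in range(n_jobs + 1)]
    for n in range(n_jobs + 1):
        if n > 0:
            q[n][n - 1] = mu1      # center 1 finishes a job
        if n < n_jobs:
            q[n][n + 1] = mu2      # center 2 finishes a job
        q[n][n] = -sum(q[n])       # diagonal is still 0 when summed
    return q


def steady_state(q):
    """Solve pi * Q = 0 with sum(pi) = 1 by Gaussian elimination."""
    n = len(q)
    # Row j holds the balance equation for state j (transpose of Q).
    a = [[q[i][j] for i in range(n)] for j in range(n)]
    b = [0.0] * n
    # One balance equation is redundant; use normalization instead.
    a[-1] = [1.0] * n
    b[-1] = 1.0
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= f * a[col][c]
            b[r] -= f * b[col]
    pi = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(a[r][c] * pi[c] for c in range(r + 1, n))
        pi[r] = (b[r] - s) / a[r][r]
    return pi


if __name__ == "__main__":
    print(steady_state(cyclic_generator(3, 2.0, 1.0)))
    print(state_count(6, 10))
```

For this toy network the result agrees with the product form, in which the probability of n jobs at center 1 is proportional to (mu2/mu1)^n.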


The  pervasiveness  with  which  priority  scheduling  is  used  in 
computer  systems  makes  it  highly  desirable  to  provide  an 
efficient  means  of  deriving  performance  measures  from  queueing 
networks  that  involve  priority  scheduling.  In  this  paper,  we 
describe  and  evaluate  several  techniques  for  efficiently 
investigating  priority  scheduling  in  queueing  networks  without 
resorting  to  the  global  balance  equation  technique.  We  use 
several  related  models,  each  resembling  the  priority  scheduling 
model  while  retaining  the  "product  form"  of  system  state 
probabilities. 


2. THE PRIORITY SCHEDULING MODEL

As  in  most  analytic  modelling  studies,  we  will  use  a  model  of  a 
computer  system  that  contains  very  little  of  the  complexity  of  an 
actual  system.  We  will,  in  fact,  assume  a  constant 
multiprogramming  level  in  each  job  class  and  exponentially 
distributed  service  times  at  all  service  centers  for  all  job 
classes. Such assumptions do not typically hold in actual
systems.  While  treatment  of  errors  introduced  by  such 
assumptions  is  of  great  importance,  we  will  not  study  it  in  this 
paper.  We  take  a  queueing  network  model  involving  priority 
scheduling  to  be  our  exact  objective,  and  we  will  suggest  and 
evaluate  several  ways  of  approximating  it  with  other  queueing 
network  models  not  involving  priorities. 


Consider  a  computer  system  that  supports  an  interactive  system 
while  also  executing  batch  jobs.  Many  such  systems  favor  the 
interactive  tasks  in  cpu  scheduling  by  giving  them  full 
preemptive  priority  over  the  batch  jobs.  Yet,  at  the  io 
channels,  all  requests  are  served  in  FCFS  order.  For  our 
purposes,  we  will  model  such  a  system  as  a  queueing  network  model 
with  a  single  central  processor  and  several  channel/device 
groups.  There  will  be  two  classes  of  jobs:  class  A  will 
represent  interactive  jobs,  and  class  B  will  represent  batch 
jobs.  The  scheduling  discipline  at  all  channels  will  be  FCFS 
while  the  discipline  at  the  central  processor  will  be  preemptive 
priority  in  favor  of  class  A  over  class  B.  That  is,  the  central 
processor  will  be  devoted  to  class  A  whenever  a  class  A  job  is 
available  for  service.  Within  each  class,  jobs  are  served  in 
FCFS  order.  The  average  service  times  at  the  channels  are  the 
same for both classes, but the average cpu burst times between io
requests  may  differ  between  the  classes.  For  each  class,  a  fixed 
set  of  transition  probabilities  governs  the  movement  of  jobs  to 
io  channels  after  receiving  a  burst  of  service  at  the  cpu.  Jobs 
always  return  to  the  cpu  after  receiving  service  at  an  io  channel 
(making  this  a  "central  server"  model). 


Our purpose in using this model will be to examine how the
presence of a background load of batch jobs affects the
throughput and response times of interactive jobs. The model can
also be used to predict changes in performance resulting from
such changes as increasing or decreasing the batch
multiprogramming level, changing hardware components (such as
inserting a faster cpu or io device), etc.

Because the processing of interactive jobs requires that a
certain  amount  of  memory  be  devoted  to  each  one,  the  number  of 
interactive  jobs  concurrently  active  is  much  less  than  the  total 
population  of  customers  of  interactive  terminals.  Thus,  for  both 
classes  of  customers,  we  will  assume  a  fixed  level  of 
multiprogramming,  and  make  a  "heavy-load"  assumption  stating  that 
each  job  that  completes  is  immediately  replaced  by  a 
statistically  identical  job.  This  is  represented  in  the  model  by 
a  transition  from  the  cpu  service  center  back  to  itself.  Thus, 
the  throughput  calculations  we  make  will  be  accurate,  but  the 
response  times  will  reflect  only  the  time  from  the  job's  first 
entry  into  the  cpu  queue  until  its  completion.  The  time  spent 
waiting  to  enter  the  multiprogramming  mix  is  ignored. 

A  modification  of  this  model  could  use  one  additional  service 
center  (located  on  the  transition  path  from  the  cpu  to  itself)  to 
represent  the  set  of  all  individual  user  terminals.  This 
modified model would better represent systems in which there is
no  fixed  limit  on  the  maximum  number  of  interactive  jobs  that  can 
be  simultaneously  active.  A  model  incorporating  both  a  varying 
level  of  interactive  tasks  and  an  upper  bound  on  the  number  that 
are  simultaneously  active  requires  more  complex  analysis 
techniques. [11]

Other approaches to modelling priority
disciplines are taken by Mitrani [17], Sauer and Chandy [15],
and Reiser [18]. Mitrani treats a two-server model (cpu and io)
with  N  customers  having  N  necessarily  distinct  priorities. 
Preemptive  priority  scheduling  is  assumed  at  the  cpu  while  the  io 
server  can  use  either  preemptive  or  non-preemptive  head-of-the- 
line  priorities.  Mitrani  derives  processor  utilization  and 
response  times  by  class. 

Sauer  and  Chandy  reduce  the  complexity  of  the  model  by 
aggregating  classes  and  by  using  Norton's  Theorem  [9]  to 
aggregate  the  channel  service  centers.  Application  of  Norton's 
theorem  to  queueing  networks  not  in  local  balance  results  in  a 
model  with  only  approximately  the  same  performance 
characteristics  as  the  original  model.  After  aggregation  of 
classes  and  centers,  the  reduced  model  is  sufficiently  small  to 
be  solved  efficiently  even  though  it  is  not  in  local  balance. 


Reiser  briefly  treats  a  model  involving  two  priority  classes.  He 
suggests  a  decomposition  of  the  model  in  which  the  lower  priority 
class  is  effectively  served  by  a  processor  whose  capacity  is 
reduced  by  the  utilization  attained  by  high  priority  jobs.  The 
two  classes  compete  for  the  cpu,  but  they  each  have  their  own 
dedicated  terminals  and  io  devices. 


Below,  we  summarize  the  important  parameters  that  describe  our 
priority  scheduling  model. 


K    =  the number of channel/device groups
Ni   =  the number of class i jobs concurrently active, for i=a,b
Mj   =  mean service time at channel j, for j=2,...,K
Mi1  =  mean cpu burst length for class i jobs, for i=a,b
Pij  =  probability of a class i job going to channel j after
        completing a burst of cpu service, for i=a,b and j=2,...,K
Pi0  =  the probability of a class i job completing after each
        cpu burst, for i=a,b

Performance  measures  of  interest  include: 

Uij  =  utilization  of  device  j  by  class  i  jobs 
Qij  =  average  number  of  class  i  jobs  at  device  j 
Ti  =  throughput  of  class  i  jobs  (the  rate  at  which 
they  are  completed) 

Ri  =  response  time  of  class  i  jobs  (the  expected  time  from 
entering  the  multiprogramming  mix  until  completion) 
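
Because the number of active class i jobs is held fixed at Ni, the throughput and response time measures above are linked by Little's law, Ri = Ni / Ti. This relation is a standard result rather than something stated in the text at this point, and the function name below is a hypothetical choice for the sketch.

```python
def mean_response_time(n_jobs, throughput):
    """Little's law for a fixed multiprogramming level: R = N / T.

    n_jobs     -- number of concurrently active jobs of the class (Ni)
    throughput -- completion rate of that class (Ti), in jobs per second
    """
    if throughput <= 0:
        raise ValueError("throughput must be positive")
    return n_jobs / throughput


if __name__ == "__main__":
    # Four active interactive jobs completing at 2 jobs per second.
    print(mean_response_time(4, 2.0))
```

The response time obtained this way covers only the time spent in the multiprogramming mix, consistent with the definition of Ri given above.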


3.  APPROXIMATION  TECHNIQUES 

In this section, we present a sequence of approximations using
queueing  networks  with  product  form  state  probabilities  to  learn 
about  the  performance  of  the  preemptive  priority  scheduling  model 
of  the  previous  section.  Some  of  these  approximations  provide 
upper  and  lower  bounds  on  performance  that  are  frequently  quite 
tight.  In  all  cases,  we  will  be  concerned  with  system 
performance  as  seen  by  the  interactive  jobs.  An  analogous  set  of 
approximate  models  would  allow  us  to  study  performance  with 
respect  to  batch  jobs. 


A.  Single  class  models 


First,  we  consider  the  extent  to  which  models  involving  only  a 
single  class  of  customers  can  be  helpful.  Clearly,  an  optimistic 
bound  on  the  interactive  job  performance  can  be  obtained  by 
ignoring  the  presence  of  the  batch  jobs  (i.e.,  setting  Nb  to 
zero).  This  model,  having  only  a  single  class  of  customers,  is 
analyzable using Buzen's computational algorithm. [7] (We will
refer to this model involving no class B customers by "NOB.")


By  ignoring  the  presence  of  batch  jobs,  the  NOB  model  essentially 
gives  class  A  jobs  preemptive  priority  at  the  io  devices  as  well 
as  at  the  central  processor.  Since  the  service  discipline  at  the 
io  channels  is  actually  FCFS,  we  can  make  the  NOB  model  more 
realistic  by  expanding  class  A's  io  service  times  to  reflect  the 
delays  that  would  be  incurred  by  class  A  jobs  on  account  of  class 
B  jobs  at  the  io  service  centers.  Consider  doing  this  using  a 
sequence  of  two  single  job  class  models.  First,  analyze  a  model 
involving  only  the  batch  jobs  and  observe  the  utilizations  (Ubj) 
and  queue  lengths  (Qbj)  at  the  io  service  centers.  Second, 
analyze  a  model  involving  only  the  interactive  jobs  but  with 
their  io  service  times  expanded  to  reflect  congestion  due  to  the 
batch  jobs.  Two  specific  ways  of  expanding  the  mean  io  service 
times  are: 


Added io times (AIO):       Mj' = Mj * (1 + Qbj)

and

Multiplied io times (MIO):  Mj' = Mj / (1 - Ubj)

In the former case, the expected time that a class A job must
wait during the service of class B jobs is added to its service
time. In the latter case, the service time is multiplied by the
inverse of the fraction of time the channel is idle. The AIO model assumes that
queueing  of  batch  jobs  is  not  substantially  affected  by  the 
presence  of  class  A  jobs  and  that  the  class  A  jobs  have  little 
effect  on  one  another.  It  is  sometimes  optimistic  and  sometimes 
pessimistic  with  respect  to  class  A  performance.  The  MIO  model 
essentially  represents  class  B  jobs  having  preemptive  priority  at 
the  io  service  centers,  and  is  thus  a  pessimistic  bound  on  the 
performance  with  respect  to  the  interactive  jobs. 
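
The two expansions of the io service times can be written as small helper functions. This is a sketch only: the function and argument names are hypothetical, and the batch-only queue lengths (Qbj) and utilizations (Ubj) are assumed to come from a separate analysis of the batch-only model.

```python
def aio_service_time(m_j, q_bj):
    """Added io times (AIO): Mj' = Mj * (1 + Qbj)."""
    return m_j * (1.0 + q_bj)


def mio_service_time(m_j, u_bj):
    """Multiplied io times (MIO): Mj' = Mj / (1 - Ubj)."""
    if not 0.0 <= u_bj < 1.0:
        raise ValueError("batch utilization must be in [0, 1)")
    return m_j / (1.0 - u_bj)


if __name__ == "__main__":
    # Illustrative numbers only: a 40 ms channel, with batch jobs alone
    # giving a mean queue length of 0.5 and a utilization of 0.5.
    print(aio_service_time(40.0, 0.5))
    print(mio_service_time(40.0, 0.5))
```

With these illustrative inputs the MIO expansion is the larger of the two, in line with its role as the more pessimistic adjustment.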

The  accuracy  of  the  AIO  and  MIO  models  depends  on  the  assumption 
that  the  queueing  and  utilization  pattern  of  class  B  will  not  be 
altered  substantially  by  adding  the  class  A  load.  This  will  be 
true  only  if 
(1) class B alone, at its normal multiprogramming level, does
    not nearly saturate any device used by class A,
and (2) the average number of class A jobs active (i.e., not at
    terminals) at any particular time is small.

Thus, in order to find more p

B.  Multiple class models


By using analysis algorithms for multiple job-class models, we
can obtain better approximations of the priority scheduling
model. The NOB model remains as the easiest optimistic bound,
but we will establish several increasingly tighter pessimistic
bounds. One pessimistic bound is easily obtained by employing PS
scheduling at the central processor and FCFS scheduling at the io
channels. This model is called "PSC" for processor-shared
central processor.


In  order  to  eliminate  central  processor  contention  without 
completely ignoring class B jobs, a second central processor
(called  the  "shadow  cpu")  can  be  provided  for  the  exclusive  use 
of  class  B  jobs.  Note  that  this  model  (called  "SHD")  is  no 
longer  a  central-server  model  since  class  A  jobs  use  one  cpu  and 
the  io  channels,  while  class  B  jobs  use  the  shadow  cpu  and  the  io 
channels.  Class  B  jobs  will  be  receiving  unrealistically  good 
service  at  their  cpu  since  they  don't  contend  with  class  A  jobs. 
Therefore,  class  B  jobs  will  congest  the  io  queues  more  than  they 
actually  would  in  the  priority  scheduling  model.  Thus,  the 
performance  of  the  SHD  model  is  still  a  pessimistic  bound  on  the 
performance  with  respect  to  class  A  jobs  in  the  priority 
scheduling  model. 


A variation of the SHD model involves slowing down the progress
of class B jobs by reducing the service rate of the shadow cpu to
reflect the cpu utilization of class A jobs. This can be done by
multiplying the class B mean service time at the shadow cpu by
1/(1-Ua1), where Ua1 is the utilization of the cpu by class A
jobs. While Ua1 is not known a priori, we do know that Ua1 is
bounded above by the cpu utilization of class A jobs with no
class B jobs present (NOB model) and below by the cpu utilization
of class A jobs when the cpu discipline is processor sharing (PSC
model). Further, an iterative approach can be used to carry out
a binary search on that interval to reach a self-consistent class
A cpu utilization, that is, one that can be used to adjust the
class B shadow cpu service time and will result in itself as the
class A utilization. We call this model "UTA" since it involves
adjusting the assumed cpu utilization of class A to be
self-consistent. This model differs from the priority scheduling
model only in that the times from first service to completion of
class B jobs are exponentially distributed, and thus the times
between departures of class B
distributed  as  long  as  the  shadow 
scheduling  model  the  variance 
will  be  higher  since  class  B  jobs 
busy  periods  at  the  cpu.  Sine 
time  distribution  decreases  th 
models,  class  B  will  progress  b 
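
The self-consistency search used by the UTA model can be sketched as a bisection over the interval between the PSC and NOB class A cpu utilizations. In the Python sketch below, `solve_shadow_model` is a hypothetical stand-in for a product form solver that takes an assumed class A cpu utilization, scales the class B shadow cpu service time by 1/(1 - u), and returns the resulting class A cpu utilization. The toy solver in the example is invented purely so the code runs and does not represent any real network.

```python
def self_consistent_utilization(solve_shadow_model, u_low, u_high, tol=1e-9):
    """Bisection for u such that solve_shadow_model(u) == u.

    u_low  -- lower bound on class A cpu utilization (e.g. from PSC)
    u_high -- upper bound on class A cpu utilization (e.g. from NOB)
    Assumes the solver returns more than u at u_low and less than u
    at u_high, so that a crossing lies inside the interval.
    """
    lo, hi = u_low, u_high
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if solve_shadow_model(mid) > mid:
            lo = mid   # assumed utilization too small; move up
        else:
            hi = mid   # assumed utilization too large; move down
    return (lo + hi) / 2.0


if __name__ == "__main__":
    # Toy stand-in for a network solver (NOT a real model):
    # the resulting utilization rises mildly with the assumed one.
    def toy_solver(u):
        return 0.3 + 0.4 * u

    print(self_consistent_utilization(toy_solver, 0.3, 0.7))
```

Each bisection step costs one solution of the product form model, so the number of solver calls grows only with the logarithm of the required precision.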

In summary, the five models for which mean response time curves
are presented in the figures are:

PSC - processor-shared cpu
SHD - shadow cpu for class B
UTA - shadow cpu with reduced service rate according
      to class A utilization
EXT - exact solution (from global balance equations)
NOB - no class B jobs


The model indicated as case I is artificial. The branching
probabilities are chosen to be the same for each class simply to
clarify the impact of priority scheduling. For fixed branching
probabilities and io service times, various choices of cpu
service times for the two classes lead to the results displayed
in figures 2, 3 and 4.


In fig. 2, the mean cpu service times for the two classes are
chosen to be equal and to approximately balance the jobs between
io and cpu boundedness. The solid lines reflect a case with four
batch jobs. Note that the PSC and NOB curves with Nb=4 differ by
approximately a factor of two. Improving the pessimistic bound
by using the shadow cpu model (SHD) substantially reduces the
range in which the actual expected response time curve must lie.
The additional improvement of the UTA model is less substantial,
although the degree of improvement increases noticeably as Na
increases.

When there are only two batch jobs (dashed lines), the relative
positions of the various curves remain the same although the
magnitude of the distances between them is approximately halved.
One additional curve is shown in fig. 2 for Nb=2. It is labelled
"EXT" and represents the exact solution as obtained by using
Gauss-Seidel iteration to solve the global balance equations for
the system state probabilities. The exact solution was found in
many different situations when the number of system states was
under 1000. This was done to check the reasonableness of the
approximations, and to obtain intuition about how EXT was related
to the other curves in the cases with small system state spaces.
In all cases, EXT was found to lie between UTA and NOB according
to expectations. Note that when Nb=0, all curves are coincident
with the NOB curve in fig. 2.

FIGURE 2. BALANCED LOAD
(Vertical axis: Mean Class A Response Time, in seconds)


Fig. 3 displays results for situations with class B jobs cpu
bound and class A jobs either cpu bound (case Ia) or io bound
(case Ic). When the interactive jobs are io bound, the
positioning  of  the  curves  is  similar  to  that  seen  in  fig.  2, 
except  that  UTA  is  only  a  slight  improvement  over  SHD.  However, 
when  the  interactive  jobs  are  cpu  bound,  the  approximations  are 
far  more  successful  at  constraining  the  range  in  which  the  exact 
result  may  lie  (indicated  by  the  shaded  areas  in  fig.  3). 


In fig. 4, results for io bound batch jobs are shown. When the
interactive jobs are also io bound (case Id), the PSC, SHD and
UTA models are virtually identical and differ substantially from
the NOB model. This is the case in which the approximations are
of the least utility. When the interactive jobs are cpu bound
(case Ib), the UTA and NOB curves constrain the exact result to
an increasing extent as Na increases.

Finally, results for case II are shown in fig. 5. The transition
probabilities and mean service times for case II were obtained
from measurements on an IBM 370/158 running the TSO interactive
system and batch jobs under the OS/MVT operating system. Thus,
case II is intended to be indicative of a "typical" system
environment. For batch multiprogramming levels (Nb) of both
three and six, the relationships among the curves are most similar
to those of case Ic of fig. 2. The UTA and NOB models loosely
bound the exact mean response time curve. Note that the batch
load in case II has a shorter mean cpu burst than does the
interactive load. If the batch mix had been less io bound, the
UTA and NOB models would bound the exact mean response time curve
more tightly. Measurements obtained by Rose [4] resulted in very
similar TSO characteristics, but the batch programs had cpu bursts
about four times longer on average than the batch jobs in the
environment described here.


Due to space restrictions, the AIO and MIO models are not
included in the figures. In most cases, their predictions were
more pessimistic than SHD but more optimistic than PSC. When the
batch jobs were cpu bound (cases Ia and Ie), the AIO and MIO
predictions were nearly identical. When the batch jobs became io
bound, the AIO predictions moved much closer to those [...]
while the MIO predictions remained quite pessimistic. [...]
model provides an easy, but not very tight, pessimistic bound
without requiring multiple-class analysis algorithms. [...]
model is less useful.
FIGURE 3. CPU BOUND BATCH JOBS

FIGURE 5. MEASURED STATISTICS

5. CONCLUSIONS

is very small, there is little reason not to use it.

The general concept of a "shadow cpu" can be usefully applied in
more  general  priority  scheduling  discipline  contexts  than  the  one 
treated  in  this  paper.  For  example,  the  extension  to  multiple 
priority  levels  or  multiple  servers  with  priority  disciplines  is 
conceptually  straightforward,  although  the  iterative  calculations 
to  reach  self-consistent  parameters  become  much  more  complex. 
IMPROVING APPROXIMATIONS OF
AGGREGATED QUEUEING NETWORK SUBSYSTEMS


K.C. Sevcik, A.I. Levy, S.K. Tripathi, and J.L. Zahorjan


ABSTRACT 


Queueing  network  models  become  more  difficult  to  analyze  as 
they  include  more  detail  of  the  computer  systems  they  represent. 
Only those models that satisfy local balance are currently
susceptible  to  analysis  by  computationally  efficient  algorithms. 
Other  models  can  be  treated  by  simulation,  or,  if  small  enough, 
by  the  solution  of  global  balance  equations.  A  third  alternative 
is  to  use  an  approximate  solution  technique.  Such  techniques 
involve  altering  the  structure,  parameters,  or  assumptions  of  the 
model  in  order  to  permit  efficient  analysis.  Analysis  of  the 
approximate  model  yields  answers  close  to  those  that  would  result 
from  exact  analysis  of  the  original  model,  but  with  far  less 
computational  effort.  One  important  approximation  technique  is 
the  aggregation  of  subnetworks.  After  analysing  the  subnetwork, 
a  composite  server  replaces  it  in  a  reduced  model.  In  this 
paper, we examine some improved methods of establishing the
characteristics of composite servers. Because the variance of
interarrival  times  is  known  to  affect  the  queueing  properties  at 
a  single  server,  we  examine  the  influence  of  transition 
processes,  by  which  customers  circulate  among  the  service  centers 
of  a  queueing  network  model. 
1. Introduction

Queueing  network  models  have  attained  acceptance  as  useful 
tools  in  assessing  and  predicting  the  performance  of  computing 
systems.  Efficient  computational  techniques  for  analyzing  these 
models  have  been  developed  for  queueing  networks  satisfying 
certain restrictive conditions. While these conditions have been
substantially weakened in recent years, models directly
solvable with the efficient computational algorithms still cannot
fully represent the complexity of the actual computer systems
that we would like to study [BCBKTD].

Models  for  which  efficient  computational  analysis  algorithms 
are  not  known  must  be  treated  using  other  techniques.  One 
approach is to decompose the model [CHW1,Cou]. For example, a
subnetwork  can  be  independently  analyzed,  then  replaced  by  a 
single  composite  service  center,  selected  to  represent  the 
interaction  of  the  subnetwork  with  the  rest  of  the  system. 
Relative  to  direct  analysis  of  the  entire  network,  far  less 
computational  effort  is  required  by  analyzing  first  the 
subnetwork,  then  the  network  in  which  the  subnetwork  is 
represented  by  a  single  service  center.  Although  some  parts  of 
the  analysis  involve  approximations,  under  certain  conditions  the 
estimations  of  performance  measures  are  sufficiently  accurate  to 
be useful [Cou]. An additional motivation for aggregating
subnetworks  is  to  increase  the  efficiency  of  repeated  experiments 
in  which  a  portion  of  a  queueing  network  stays  fixed  while 
parameters  are  altered  elsewhere  in  the  network.  Representing 
the  fixed  subnetwork  by  a  single  service  center  substantially 
reduces  the  computational  effort  required  in  each  trial. 

In  this  paper,  we  will  discuss  and  extend  techniques  that 
have  been  proposed  for  establishing  the  characteristics  of  a 
composite  service  center  intended  to  represent  an  entire 
subsystem  to  the  rest  of  the  network.  In  section  2,  we  define 
our  terminology  and  establish  the  context  of  this  work  by 
surveying  relevant  literature.  In  section  3,  we  establish  some 
approximations  for  characterizing  activities  that  occur  in 
queueing  networks.  The  approximations  are  intended  to  permit  a 
more  detailed  representation  of  the  patterns  in  which  programs 
use  computer  system  resources.  The  relationship  of  the 
approximations  to  the  aggregation  of  subnetworks  and  the  analysis 
of  entire  networks  is  presented  in  section  4,  and  examples  are 
discussed in section 5. Section 6 contains a summary of our
conclusions. 


2. Terminology and Previous Work

A queueing network model of a computer system consists of
service centers representing such computer system resources as
processing units, channels, peripheral devices and terminals, and
customers, whose usage of system resources is represented by a
sequence  of  visits  to  the  corresponding  service  centers. 
Customers  are  completely  characterized  by  their  class.  With  each 
class of customers and each service center are associated service
time distributions and transition probabilities. Samples from
the service time distribution determine the duration of a
customer's use of a service center, and the transition
probabilities  govern  the  movements  of  customers  among  the  service 
centers.  In  this  paper,  we  treat  only  closed  networks,  in  which 
the  total  number  of  customers  in  the  network  is  constant. 

A  special  form  of  queueing  network  model  that  has  proven  to 
be  particularly  useful  in  modelling  computer  systems  is  the 
closed  central  server  model  with  a  single  class  of  customers 
[Figure 1: Central Server Model. Label: Peripheral Servers]

[Figure 2: Distributional Forms. Label: Exponential]
    a = (T + \sqrt{kT}) / (CV_i^2 + 1),   where T = 1 - (k-1) CV_i^2

    r = (k - 1 + a) / M_i


The  formulae  for  specifying  the  distribution  parameters  determine 
a  unique  distribution  within  the  family  of  distributions  that 
have  the  required  mean  and  coefficient  of  variation. 
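As a rough illustration of the GEr parameter formulas, the sketch below reads them as a = (T + sqrt(kT))/(CV^2 + 1) with T = 1 - (k-1)CV^2, and r = (k - 1 + a)/M. It assumes (this is not stated in the surviving text) that k is the smallest integer with 1/k <= CV^2, that every customer passes through k-1 exponential stages of rate r, and that a is the probability of passing through one further stage of rate r. The function names are hypothetical.

```python
import math


def ger_parameters(mean, cv2):
    """Return (k, a, r) for a GEr distribution with the given mean and CV^2 <= 1."""
    if not 0.0 < cv2 <= 1.0:
        raise ValueError("GEr is used here only for 0 < CV^2 <= 1")
    # Assumption: k is the smallest integer with 1/k <= CV^2.
    k = max(1, math.ceil(1.0 / cv2 - 1e-12))
    T = max(0.0, 1.0 - (k - 1) * cv2)
    a = (T + math.sqrt(k * T)) / (cv2 + 1.0)
    r = (k - 1 + a) / mean
    return k, a, r


def ger_moments(k, a, r):
    """Mean and CV^2 of k-1 stages plus one optional stage (probability a)."""
    mean = (k - 1 + a) / r
    # Variance of the stage sum plus variance due to the random stage count.
    variance = (k - 1 + a) / r**2 + a * (1.0 - a) / r**2
    return mean, variance / mean**2


k, a, r = ger_parameters(mean=2.0, cv2=0.3)
check_mean, check_cv2 = ger_moments(k, a, r)
```

Under these assumptions the round trip recovers the requested mean and CV^2, and the boundary cases CV^2 = 1/k give a = 1, an ordinary Erlang distribution with k stages.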

For the HE2 distribution, the values of a, r and s are
established as by Sauer and Chandy [SC]. The GEr distribution of
figure  2  differs  from  the  generalized  Erlang  structure  used 
previously  by  others.  With  this  GEr  structure,  all  customers 
pass  through  either  k-1  or  k  stages,  while  structures  used 
previously  permit  some  customers  to  pass  through  only  one  or  even 
zero  stages.  The  advantage  of  the  GEr  structure  used  here  is 
that  its  density  function  has  only  one  relative  maximum,  so  that 
it  resembles  empirically  observed  service  time  distributions  of 
low  variance. 

In  single  class  models  with  exponentially  distributed  service 
times,  most  performance  measures  of  interest  are  independent  of 
the  service  disciplines  used.  However,  in  multiple  class  models 
or  models  including  non-exponential  service  time  distributions, 
the  service  disciplines  must  be  considered.  The  two  disciplines 
that we will use are first-come-first-served (FCFS), where
customers are served in the order of their arrival to the service
center, and processor-sharing (PS), where all customers at the
service  center  share  the  processor’s  power  equally.  A  server  is 
said  to  be  load  dependent  if  its  service  rate  changes  depending 
on  the  customer  mix  at  the  server. 


Queueing  networks  that  obey  certain  constraints  are  known  to 
satisfy local balance and to have product form state
probabilities [CHT,GN,BCMP]. The product form can be exploited
by efficient computational algorithms for calculating performance
measures of interest [Buz,CHW1,RK2]. Unfortunately, local
balance is precluded by the presence of a FCFS service center
with either a non-exponential service time distribution or
non-identical exponential service time distributions for different
classes.

Analysis  of  queueing  networks  that  do  not  satisfy  local 
balance  typically  requires  several  steps.  Either  some 
simplifying  assumptions  are  employed  to  modify  the  network  to 
satisfy  local  balance,  or  a  decomposition  approach  is  used.  In 
the  latter  case,  a  composite  server  replaces  a  subnetwork  in 
order  to  facilitate  analysis.  It  is  feasible  to  solve  the  set  of 
global  balance  equations  only  for  very  simple  networks. 


A central theme of this paper will be the role of transition
processes in queueing network analysis. A transition process is
a stream of events that mark the movement of customers from one
place to another. Each service center has an arrival (or input)
process of arriving jobs and a departure (or output) process of
completed jobs. Note that the departure process [...]
split up according to transition probabilities [...]
contributes to the arrival process of some other service center.
We will characterize a transition process by [...]
the coefficient of variation of the distribution of interevent
times in the stream. We will ignore the higher moments of the
interevent time distributions, and also the fact that transition
processes are not necessarily renewal processes [...]
successive interevent times are not independent samples from the
overall interevent time distribution. The most [...]
transition process is the Poisson process, in which interevent
times are exponentially distributed.
The initial work on the use of a composite server to
represent a subsystem in a queueing network model was reported by
Chandy, Herzog and Woo [CHW1]. They defined Norton's theorem for
queueing network models, which is analogous to Norton's theorem
of electrical circuit theory. A Norton's theorem reduction in a
queueing  network  model  involves  replacing  a  subnetwork  that  has  a 
single  input  stream  and  a  single  output  stream  with  a  composite 
server.  The  composite  server  is  intended  to  behave,  with  respect 
to  the  rest  of  the  queueing  network,  exactly  as  did  the  original 
subnetwork.  By  studying  the  subsystem  to  be  replaced  in 
isolation  under  each  possible  customer  mix,  mean  throughput  rates 
conditioned  on  the  customer  mix  are  obtained.  To  do  this,  all 
service  centers  outside  the  subsystem  are  short-circuited  by 
either  removing  them  or,  equivalently,  by  giving  them  extremely 
high  service  rates.  The  scheduling  discipline  at  the  composite 
server  is  conceptualized  as  "composite  queueing"  (CQ)  [Tow].  CQ 
can  be  thought  of  as  providing  each  customer  class  with  a 
processor  whose  service  rate  depends  on  the  current  mix  of 
customers  in  the  subsystem.  For  each  class  and  customer  mix,  the 
rate  of  the  processor  is  set  to  be  the  throughput  of  the 
subsystem.  The  customers  of  each  class  are  served  at  their 
processor  according  to  a  PS  discipline.  For  example,  if  a 
particular  customer  mix  in  the  subsystem  includes  three  class  A 
customers and causes the class A throughput in the subsystem to
be 1/2, then class A's CQ processor with that customer mix has
rate 1/2, and each of the three class A customers is served at
rate 1/6.
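The composite queueing example can be restated as a one-line calculation: the class's processor runs at the class throughput for the current mix, and processor sharing divides that rate equally among the class's customers. The helper name below is hypothetical.

```python
from fractions import Fraction


def cq_per_customer_rate(class_throughput, customers_in_class):
    """Rate at which each customer of a class is served under CQ.

    class_throughput: subsystem throughput of the class for the current mix.
    customers_in_class: number of customers of that class in the subsystem.
    """
    if customers_in_class <= 0:
        raise ValueError("the class must have at least one customer present")
    # Processor sharing splits the class processor equally.
    return Fraction(class_throughput) / customers_in_class


# Three class A customers, class A throughput 1/2 for this mix.
rate = cq_per_customer_rate(Fraction(1, 2), 3)
```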


When  a  queueing  network  satisfies  local  balance,  a  Norton's 
theorem  reduction  of  a  subsystem  is  exact  in  that  the  joint 
probability  distribution  of  queue  lengths  at  servers  not  in  the 
subsystem  is  identical  in  the  original  and  reduced  systems.  For 
each  class,  the  service  time  distribution  at  the  composite  server 
can  be  assumed  exponential  with  the  required  mean  to  assure  the 
appropriate  throughput.  In  locally  balanced  networks,  the  choice 
of  service  discipline  within  each  class  is  irrelevant  since  it 
does  not  alter  the  probability  distribution  of  system  state. 


Sauer  and  Chandy  have  investigated  the  direct  application  of 
Norton's  theorem  reductions  to  queueing  networks  that  do  not 
satisfy local balance [SC]. In this case, the network containing
the composite server provides only an approximation to the
original network. The step in which the subsystem is examined in
isolation involves an implicit assumption that the input process
to the subsystem is identical to its output process. This
assumption is not valid in networks that do not satisfy local
balance (generally, only mean input and output rates are
identical) and causes the reduction to be inexact. Thus, there
is a motive to give closer consideration to transition processes
into and out of the subsystem, treating characteristics other
than the first moment. Specifically, in the following sections,
we will study the benefits of involving the second moments of
interevent times in representing a subsystem.

Since  direct  application  of  Norton's  theorem  reductions  in 
non-locally-balanced  networks  can  result  in  unacceptable  error  in 
the  estimation  of  performance  measures  [CHW2],  Sauer  and  Chandy 
have  incorporated  the  reduction  into  several  higher  level 
approximation  algorithms,  each  designed  to  be  applicable  to  a 
specific class of non-locally-balanced networks [SC]. One class
of networks they treat is the class of central server models in
which one or more FCFS service centers have non-exponentially
distributed service times. Their approach starts by assuming
exponentially distributed service times at each service center in
the subnetwork to be reduced. A Norton's theorem reduction is
applied to reduce all the peripheral servers to a composite
server. Next, the non-exponential nature of service time
distributions is accounted for by computing an estimate of the
coefficient of variation to be reflected in the service structure
of the composite server. Then both the central server and the
composite server are represented by HE2, Exp, or GEr service
structures according to the estimated coefficients of variation.
The method of determining the coefficient of variation to be used
for the composite server is important and will be discussed in
detail later. Having reduced the system to two servers, it can
be solved by global balance or other techniques that are not
contingent on the presence of local balance [HWC].


Three  potential  sources  of  error  can  be  distinguished  in  the 
direct  application  of  a  Norton's  theorem  reduction  in  a  network 
that  does  not  satisfy  local  balance. 

(1) Exponential Assumption -- Because all service time
    distributions are assumed exponential when analyzing the
    subsystem, errors in the estimation of throughput rates
    may result.

(2) CV Omission -- The coefficient of variation of the
    composite server is set to one, regardless of the
    characteristics of the service processes at the
    individual service centers.

(3) Input/Output Assumption -- By ignoring the rest of the
    network while analyzing the subsystem, an implicit
    assumption is made that the input process and the output
    process of the subsystem are identical.
The tools to be developed in section 3 will permit us, in section
4, to treat each of these sources of error individually.


3. Transformations of Transition Processes


Because  much  is  known  about  queueing  systems  to  which  the 
arrival process is Poisson, most approximations made in
simplifying  queueing  networks  assume  that  certain  transition 
processes  are  Poisson,  when  they  actually  may  not  be.  In  this 
section,  we  will  develop  some  approximations  that  indicate  how 
the  first  two  moments  of  the  interevent  times  of  a  transition 
process  are  affected  when  the  event  stream  is  split,  merged  with 
other  streams,  or  routed  through  a  service  center  with  a  general 
service  time  distribution.  In  the  next  section,  we  will  show  how 
the  inclusion  of  additional  information  about  transition 
processes  can  be  incorporated  to  avoid  some  of  the  potential 
errors  in  methods  of  aggregating  subnetworks. 


For  service  centers  satisfying  the  characteristics  of  local 
balance,  Muntz  has  shown  that  a  Poisson  arrival  process  will 
guarantee  a  Poisson  departure  process  [Mun].  In  general, 
however,  the  departure  process  depends  on  the  arrival  process  as 
well  as  the  service  discipline  and  the  service  time  distribution 
at  the  service  center.  In  a  queueing  network,  the  departure 
process  from  a  service  center  may  indirectly  affect  its  own 
arrival  process.  In  fact,  the  arrival  process  to  any  service 
center  may  depend  on  the  entire  network. 

Rather than characterize each transition process by a single
parameter,  its  mean  rate,  we  will  involve  a  second  parameter, 
namely,  the  variance  (or,  equivalently,  the  coefficient  of 
variation)  of  the  interevent  times.  One  danger  to  keep  in  mind 
is  that,  in  general,  successive  interevent  times  will  not  be 
independent  selections  from  the  overall  distribution  of 
interevent  times.  Some  results  are  available  on  the  serial 
correlation of interevent times [Dal, Mar]; however, for
mathematical  convenience,  we  follow  the  tradition  of  assuming 
that  transition  processes  are  renewal  processes  [DC,  GP]. 


As  a  first  step,  we  examine  a  single  server  in  isolation,  and 
try  to  specify  the  mean  and  coefficient  of  variation  of 
interdeparture  times  in  terms  of  the  means  and  coefficients  of 
variation  of  the  interarrival  and  service  times.  We  assume  that 
the  service  discipline  is  FCFS.  Clearly,  unless  the  queue  is 
saturated, the mean interdeparture time and the mean interarrival
time  are  equal,  so  we  only  need  to  seek  the  coefficient  of 
variation  of  the  departure  process. 


While  our  ultimate  goal  is  to  determine  the  departure  process 
that  results  from  a  general  arrival  process  and  a  general  service 
time  distribution,  it  is  useful  to  examine  special  cases  in  which 
either  the  interarrival  times  or  the  service  times  are 
exponentially  distributed.  When  both  interarrival  times  and 
service  times  are  exponentially  distributed,  it  is  known  that  the 
interdeparture times are also exponentially distributed
[Bur,Mun]. When the arrival process is Poisson but the service
times have a general distribution, CV_d^2 is given by [DC]


    CV_d^2 = 1 + r^2 (CV_s^2 - 1)                              (1)

where r = M_s/M_a, and a, s, and d denote, respectively, the
arrival, service, and departure processes.

Since  r=Ms/Ma  is  the  loading  or  utilization  of  the  server,  the 
higher  the  utilization,  the  more  the  variation  in  service  times 
governs  the  variation  in  the  departure  process.  Equation  (1)  is 
consistent  with  the  intuition  that  as  the  utilization  approaches 
zero,  the  departure  process  becomes  identical  to  the  arrival 
process,  and  as  the  utilization  approaches  one,  the  departure 
process  becomes  identical  to  the  service  process.  Finally,  note 
that  CVd2  is  never  farther  from  one  than  is  CVs2. 


When service times are exponentially distributed but the
arrival process is not Poisson, CV_d^2 is again approximately
proportional to CV_a^2. Simulation experiments involving HE2 and
GEr interarrival time distributions indicate an approximation for
CV_d^2 that is symmetric to (1):

    CV_d^2 = 1 + (1 - r^2) (CV_a^2 - 1)                        (2)

where r = M_s/M_a.


Again the limiting cases of utilization at zero and one cause the
departure process to approach the arrival process or the service
process respectively, and again CV_d^2 is no farther from 1 than is
CV_a^2.


When both CV_a^2 and CV_s^2 differ from 1, the relationship to
CV_d^2 is less intuitive. The asymptotic dependence of the
departure process on the arrival process at low load and on the
service process at high load still holds. The simplest
generalization of formulae (1) and (2) is:

    CV_d^2 = 1 + r^2 (CV_s^2 - 1) + (1 - r^2) (CV_a^2 - 1)     (3)


Simulation  experiments  indicate  that  (3)  yields  good  predictions 
of  CVd2  for  a  range  of  CVs2  and  CVa2  values.  Figure  3  compares 
predicted  and  simulated  values  of  CVd  as  a  function  of  CVa  with 
CVs=4. (The third curve is based on [GP] and will be discussed
in section 6.) Note that formula (3) guarantees that

    |CV_d^2 - 1| \le max(|CV_a^2 - 1|, |CV_s^2 - 1|)
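Formulas (1) to (3) are simple enough to evaluate directly. The sketch below implements formula (3), which contains (1) and (2) as the special cases of a Poisson arrival process and of exponential service times; the function and argument names are hypothetical.

```python
def departure_cv2(mean_interarrival, cv2_arrival, mean_service, cv2_service):
    """Approximate squared CV of interdeparture times at a FCFS server, formula (3)."""
    r = mean_service / mean_interarrival  # utilization, r = Ms/Ma
    if not 0.0 <= r <= 1.0:
        raise ValueError("the formula assumes an unsaturated server (0 <= r <= 1)")
    return 1.0 + r**2 * (cv2_service - 1.0) + (1.0 - r**2) * (cv2_arrival - 1.0)


# Poisson arrivals (CVa^2 = 1): formula (3) reduces to formula (1).
poisson_case = departure_cv2(2.0, 1.0, 1.0, 4.0)

# Exponential service (CVs^2 = 1): formula (3) reduces to formula (2).
exponential_case = departure_cv2(2.0, 4.0, 1.0, 1.0)

# General case: the result stays within the stated bound.
general_case = departure_cv2(2.0, 0.5, 1.0, 4.0)
```

Because r^2 and 1 - r^2 are non-negative weights summing to one, the deviation of the result from one is a weighted average of the two input deviations, which is where the bound comes from.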
[Figure 3: Departure Processes. Legend: Simulation; Approx (Eq. 3); Approx (Eq. 16)]

[Figure 4: Transition Processes]
Very useful properties of Poisson streams are that splitting
them (choosing a substream by a Bernoulli trial) and merging them
both  result  in  Poisson  streams.  In  the  current  context,  we  need 
analogous  results  for  splitting  and  merging  non-Poisson  streams. 
We  cannot  hope  for  such  aesthetic  results  as  arise  in  the  Poisson 
case,  but  we  can  attempt  to  determine  the  mean  and  coefficient  of 
variation  of  the  split  or  merged  streams  in  terms  of  the  same 
characteristics  of  the  contributing  stream(s).  We  will  discuss 
only  two-way  splitting  and  two-way  merging.  N-way  splits  and 
merges  can  be  treated  as  sequences  of  two-way  operations. 


We first consider splitting a non-Poisson stream (x) into two
substreams (y1 and y2) according to Bernoulli trials where the
probability of being directed to y1 is p, and to y2, 1-p. The
mean and coefficient of variation of interevent times in stream
y1 are given by:

    M_{y1} = M_x / p                                           (4)

    CV_{y1}^2 = 1 + p (CV_x^2 - 1)                             (5)

The equation for the mean is easy to derive from the fact that
proportion p of the events are routed to stream y1. The
derivation of the coefficient of variation is based on the fact
that each interevent time in stream y1 is made up of one or more
interevent times from stream x. Because routing is based on
independent Bernoulli trials, the number of stream x interevent
times that composes each stream y1 interevent time is
geometrically distributed. Thus, the interevent time
distribution of stream y1 is

    f(t) = \sum_{k=1}^{\infty} p (1-p)^{k-1} g(k,t)

where g(k,t) is the k-fold convolution of the stream x
interevent time distribution.

Equation (5) can be derived from f(t).
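One way to carry out that derivation, sketched here under the renewal assumption adopted earlier (independent stream x interevent times, independent of the routing decisions), is to treat a stream y1 interevent time as a geometric sum. The symbols Y, X_j and N are introduced only for this sketch.

```latex
% Y   : one interevent time of substream y1
% X_j : successive interevent times of stream x (mean M_x, squared CV CV_x^2)
% N   : number of stream x intervals per y1 interval, P(N = k) = p (1-p)^{k-1}
\begin{align*}
Y &= \sum_{j=1}^{N} X_j, \qquad
  E[N] = \frac{1}{p}, \qquad
  \mathrm{Var}(N) = \frac{1-p}{p^2} \\
E[Y] &= E[N]\,E[X] = \frac{M_x}{p} \\
\mathrm{Var}(Y) &= E[N]\,\mathrm{Var}(X) + \mathrm{Var}(N)\,E[X]^2
  = \frac{CV_x^2\,M_x^2}{p} + \frac{(1-p)\,M_x^2}{p^2} \\
CV_{y1}^2 &= \frac{\mathrm{Var}(Y)}{E[Y]^2}
  = p\,CV_x^2 + (1-p) = 1 + p\,(CV_x^2 - 1)
\end{align*}
```

The second line is equation (4) and the last line is equation (5).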


Equation (5) guarantees that CV_{y1}^2 is no further from one
than is CV_x^2. Intuitively, the basis for this is the presence of
the geometric distribution mentioned above. The coefficient of
variation of the geometric distribution is one. Note that a
substream selected with probability p becomes asymptotically
Poisson as p approaches zero. Kobayashi presents a formula of
identical form to that of equation (5) as an approximation of
one component of the arrival process to a given service center
[Kob]. His equation uses the squared coefficient of variation of
service times where we use CV_x^2, and thus he requires that the
server at which the arrival process [...]
heavily loaded. By using the coefficient of variation of the
output process instead of the service [...]
assumption is avoided and the equation is [...]
A pleasing feature of equations (4) and (5) is that the first
two  moments  of  the  original  stream  fully  determine  the  first  two 
moments  of  each  substream  resulting  from  the  splitting.  This 
holds  for  all  distributional  forms  assuming  no  serial  correlation 
of  interevent  times. 
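Equations (4) and (5) can be checked against a small simulation. The sketch below splits a renewal stream with Erlang-2 interevent times (mean 1, squared CV 0.5) by Bernoulli trials and compares the sample moments of the substream with the predictions. The choice of Erlang-2, the seed, the sample size and all names are assumptions made for illustration only.

```python
import random


def split_stream(mean_x, cv2_x, p):
    """Equations (4) and (5): mean and squared CV of the substream chosen with probability p."""
    return mean_x / p, 1.0 + p * (cv2_x - 1.0)


def simulate_split(p, n_events=200_000, seed=1):
    """Sample mean and squared CV of the substream's interevent times."""
    rng = random.Random(seed)
    gaps = []
    elapsed = 0.0
    for _ in range(n_events):
        # Erlang-2 interevent time of stream x: shape 2, scale 0.5.
        elapsed += rng.gammavariate(2.0, 0.5)
        if rng.random() < p:  # Bernoulli routing to the substream
            gaps.append(elapsed)
            elapsed = 0.0
    mean = sum(gaps) / len(gaps)
    variance = sum((g - mean) ** 2 for g in gaps) / (len(gaps) - 1)
    return mean, variance / mean**2


predicted_mean, predicted_cv2 = split_stream(1.0, 0.5, p=0.3)
sample_mean, sample_cv2 = simulate_split(p=0.3)
```

Because the simulated stream is a true renewal process, the comparison tests only the algebra of (4) and (5), not the effect of serial correlation.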


The merging of non-Poisson streams is somewhat more complex
[DC]. Even if two contributing streams each have no serial
correlation of interevent times, the stream that they form when
merged may have substantial correlation. Fortunately, the
situations in which this effect is most serious occur either when
the interevent time distributions are discrete or when they have
extremely low coefficients of variation. Neither of these
situations occurs commonly in the contexts in which we wish to
use the results, so we proceed as in the case of splitting. We
wish to express the first two moments of the interevent times in
the merged stream (y) in terms of the known characteristics of
the contributing streams (x1 and x2). Since the rate of the
merged stream is the sum of the rates of the contributing
streams,


    M_y = (M_{x1} M_{x2}) / (M_{x1} + M_{x2})                  (6)


Unfortunately, CV_y^2 cannot be expressed in terms of only M_{x1},
M_{x2}, CV_{x1}^2, and CV_{x2}^2. For example, consider merging two streams
whose interevent time distributions are respectively exponential
and one of the family of hyperexponentials with a given mean and
variance. The coefficient of variation of the interevent times
in the merged stream depends on which member of the
hyperexponential distribution family is chosen, even though all
members have the same mean and coefficient of variation [Sev].


One  way  to  determine  the  characteristics  of  the  merged  stream 
is  to  express  its  interevent  time  distribution  in  terms  of  the 
characteristics of the contributing streams. Each event in stream
x1 starts a stream y interevent time drawn from the cumulative
distribution

    F_{y1}(t) = F_{x1}(t) + (1 - F_{x1}(t)) RM_{x2}(t)

    where RM_{x2}(t) = (1/M_{x2}) \int_{v=0}^{t} (1 - F_{x2}(v)) dv

is the cumulative distribution function of a "random
modification" of the stream x2 interevent time
distribution [CMM, p.146].


Intuitively,  this  reflects  the  fact  that  the  next  stream  y 
event  can  occur  within  time  t  in  one  of  two  ways:  either  the 
next  event  in  x1  occurs  within  time  t  or,  otherwise,  the  next 
event  in  x2  occurs  within  time  t.  The  independence  of  the  x1  and 
x2  streams  justifies  the  assumption  that  the  time  from  an  x1 
event to the next x2 event has the distribution RM_{x2}(t).
Symmetrically, the cumulative distribution of interevent times
initiated by stream x2 events is

    F_{y2}(t) = F_{x2}(t) + (1 - F_{x2}(t)) RM_{x1}(t)

Since the proportion of x1 events in stream y is
(1/M_{x1}) / (1/M_{x1} + 1/M_{x2}), the cumulative distribution of
interevent times in stream y is

    F_y(t) = [M_{x2} F_{y1}(t) + M_{x1} F_{y2}(t)] / (M_{x1} + M_{x2})

The mean and coefficient of variation of the interevent times
in stream y can be determined from F_y(t). However, the symbolic
integration required is lengthy for all but the simplest
distributional forms of F_{x1}(t) and F_{x2}(t) [Sev]. Even for HE2
and GEr distributions, the expression for the CV of the merged
stream contains so many terms that it is useful only for
numerical evaluation.
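A numerical evaluation of this kind can be sketched with plain trapezoidal integration. The code below builds the merged interevent-time distribution from the two stream distributions and their random modifications, then integrates the survival function to obtain the mean and squared CV. The grid size, the cutoff t_max and all names are assumptions; this is not the procedure of [Sev].

```python
import math


def merged_stream_moments(F1, m1, F2, m2, t_max, steps=20_000):
    """Mean and squared CV of merged interevent times, by trapezoidal integration.

    F1, F2: interevent-time CDFs of streams x1 and x2; m1, m2: their means.
    t_max must be large enough that both survival functions are negligible.
    """
    h = t_max / steps
    ts = [i * h for i in range(steps + 1)]
    s1 = [1.0 - F1(t) for t in ts]
    s2 = [1.0 - F2(t) for t in ts]

    def random_modification(survival, mean):
        # RM(t) = (1/mean) * integral from 0 to t of (1 - F(v)) dv
        rm = [0.0]
        for i in range(1, steps + 1):
            rm.append(rm[-1] + 0.5 * h * (survival[i - 1] + survival[i]) / mean)
        return rm

    rm1 = random_modification(s1, m1)
    rm2 = random_modification(s2, m2)
    # 1 - Fy1(t) = (1 - Fx1(t)) (1 - RMx2(t)), and symmetrically for Fy2.
    sy = [(m2 * s1[i] * (1.0 - rm2[i]) + m1 * s2[i] * (1.0 - rm1[i])) / (m1 + m2)
          for i in range(steps + 1)]

    def trapezoid(values):
        return h * (sum(values) - 0.5 * (values[0] + values[-1]))

    mean = trapezoid(sy)                                        # E[Y]
    second = 2.0 * trapezoid([t * s for t, s in zip(ts, sy)])   # E[Y^2]
    return mean, second / mean**2 - 1.0


def erlang2_cdf(t):
    """Erlang-2 with stage rate 2: mean 1, squared CV 0.5."""
    return 1.0 - math.exp(-2.0 * t) * (1.0 + 2.0 * t)


mean_y, cv2_y = merged_stream_moments(erlang2_cdf, 1.0, erlang2_cdf, 1.0, t_max=30.0)
```

For two identical Erlang-2 streams the integrals can also be done by hand, giving a merged mean of 1/2 and a squared CV of 0.625, somewhat above the 0.5 of each contributing stream.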

In  related  investigations,  the  following  approximation  has 
been  proposed: 


    CV_y^2 = \sum_i (M_y / M_{xi}) CV_{xi}^2                   (7)


Kobayashi  expresses  the  arrival  process  at  one  center  in  terms  of 
the  transition  probabilities  and  the  service  processes  at  other 
centers  [Kob].  Gelenbe  and  Pujolle  refine  the  expression  by 
giving  the  departure  process  at  a  service  center  in  terms  of  its 
service  process  and  the  departure  processes  at  other  centers, 
thus removing the need for a heavy-load assumption [GP]. When
distilled to an expression for aggregating non-Poisson
substreams, each of the above expressions reduces to equation
(7). Sauer and Chandy use a closely related formula in which CV_y
and CV_{xi} replace their squared counterparts. The coefficient of
variation of the composite server resulting from a Norton's
theorem reduction is set to be the weighted sum of the
coefficients of variation of the service processes at the service
centers being aggregated [SC].


While equation (7) is exact when all substreams are Poisson,
there are other special cases in which it is not realistic. For
example, if all the substreams being merged have the same CV^2,
equation (7) causes the merged stream to have that CV^2 also.
Actually, merging two streams with CV's less than one will yield
a stream with somewhat larger CV. Similarly, merging two streams
with CV greater than one yields one with somewhat lower CV.
Finally,  it  is  known  that  merging  a  large  number  of  independent 
streams  of  arbitrary  characteristics  yields  a  nearly  Poisson 
stream. 


A  slight  modification  of  equation  (7)  avoids  the  problems 
indicated  above: 


\[ CV_y^2 = 1 + \sum_i \left(\frac{M_y}{M_{xi}}\right)^2 (CV_{xi}^2 - 1) \qquad (8) \]

The exponent of the (My/Mxi) term was chosen rather arbitrarily
based  on  a  few  specific  cases.  Any  choice  in  the  range  1.5  to 
about  3  seems  to  make  equation  (8)  more  realistic  than  equation 
(7).  Note  that  equation  (8)  reduces  to  equation  (7)  when  the 
exponent  is  set  to  one.  We  will  use  equation  (8)  for  estimating 
the  CV  that  results  from  merging  non-Poisson  streams. 
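
The difference between equations (7) and (8) is easy to see numerically. The sketch below is illustrative only: the function names are hypothetical, and the `exponent` parameter (2 for equation (8), 1 for equation (7)) is introduced here for convenience.

```python
def merged_mean(means):
    """Mean interevent time of the merged stream: rates add."""
    return 1.0 / sum(1.0 / m for m in means)


def merged_cv2(means, cv2s, exponent=2.0):
    """Squared CV of the merged stream.

    exponent=2 corresponds to equation (8); exponent=1 reproduces
    equation (7), because the weights My/Mxi sum to one.
    """
    my = merged_mean(means)
    return 1.0 + sum((my / m) ** exponent * (c - 1.0)
                     for m, c in zip(means, cv2s))


# Two equal-rate streams, each with CV^2 = 0.25.
eq7_value = merged_cv2([2.0, 2.0], [0.25, 0.25], exponent=1.0)  # stays at 0.25
eq8_value = merged_cv2([2.0, 2.0], [0.25, 0.25], exponent=2.0)  # moves toward 1

# Ten equal-rate streams with CV^2 = 4: equation (8) tends toward Poisson.
many_value = merged_cv2([10.0] * 10, [4.0] * 10)
```

With equation (8) the merged CV^2 moves toward one both from below and from above, and the movement becomes stronger as more streams are merged, which is the qualitative behaviour described in the text.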

In  this  section,  we  have  developed  tools  for  approximating 
how  a  transition  process  can  be  split,  how  each  substream  is 
affected  by  passing  through  a  service  center  with  a  general 
service  time  distribution,  and  how  the  streams  can  be  merged. 
With  these  tools,  we  are  able  to  suggest  some  improved  techniques 
for  aggregating  subnetworks  and  for  analyzing  entire  queueing 
network  models. 


4.  Approximate  Analysis  Techniques  Using  Transition  Processes 

In  this  section,  we  describe  a  sequence  of  successively  more 
complex, more accurate, and more expensive approximate analysis
techniques  based  on  the  transition  process  transformations 
developed  in  section  3.  Each  will  be  aimed  at  reducing  the  error 
resulting  from  one  or  more  of  the  three  potential  sources  of 
error  in  using  a  direct  Norton’s  theorem  reduction  on  a  central 
server  network  with  a  single  customer  class,  non-exponential 
service  times  and  FCFS  service  discipline  at  each  service  center. 
These  three  potential  sources  of  error  were  named  Exponential 
Assumption,  CV  Omission,  and  Input/Output  Assumption  at  the  end 
of  section  2. 


4.1  Improved  CV  Estimation 


Direct  application  of  Norton’s  theorem  reduction  to  central 
server  models  with  non-exponential  service  times  and  FCFS  service 
disciplines  can  lead  to  unacceptable  errors  in  performance 
measure  estimates  even  when  the  CV’s  of  service  time 
distributions differ from one by factors of five or less [CHW1].
(Empirically  observed  CV’s  frequently  differ  from  one  by  factors 
far  larger  than  five.)  Sauer  and  Chandy  try  to  remedy  the  CV 
Omission  error  of  direct  Norton’s  theorem  reductions  by 
estimating  a  coefficient  of  variation  to  be  used  in  the  composite 
server.  As  was  mentioned  in  the  previous  section,  their  estimate 
is  calculated  from  the  formula: 


\[ CV_c = \sum_i P_i\, CV_i \qquad (9) \]

where Pi is the probability of transition from the central server
to the ith peripheral service center, and CVi is the coefficient
of variation of the service time distribution at the ith
peripheral server.

Then, depending on CVc, a distributional form is chosen for the
composite server. If CVc is significantly greater than one, HE2
is chosen; if it is significantly less than one, GEr is chosen;
and otherwise Exp is chosen. A unique distribution is determined
by  setting  parameters  according  to  formulae  similar  to  those 
given  in  section  2. 
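
The selection rule can be sketched in a few lines. The names below are hypothetical, and the tolerance that decides what counts as "significantly" different from one is an assumption made for illustration; the text does not specify it.

```python
def composite_cv(probs, cvs):
    """Equation (9): probability-weighted sum of the peripheral CVs."""
    return sum(p * cv for p, cv in zip(probs, cvs))


def choose_form(cvc, tol=0.1):
    """Pick a distributional form for the composite server.

    tol is an assumed threshold for 'significantly' different from one.
    """
    if cvc > 1.0 + tol:
        return "HE2"   # two-stage hyperexponential
    if cvc < 1.0 - tol:
        return "GEr"   # generalized Erlang
    return "Exp"       # exponential


cvc = composite_cv([0.5, 0.3, 0.2], [3.0, 1.0, 0.5])
form = choose_form(cvc)
```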


Experiments  with  this  refined  form  of  aggregate  server  show 
it  to  give  good  results  over  a  wide  range  of  models  [SC].  Thus, 
simply  representing  the  variance  in  service  times  at  the 
peripheral servers by giving the composite server a
non-exponential distributional form seems to be helpful.

It  is  not  difficult,  however,  to  identify  reasonable  models 
for  which  the  approach  involving  CVc  does  not  yield  acceptable 
results.  For  example,  when  the  coefficients  of  variation  of 
service  times  at  the  peripheral  devices  differ  from  one  by  a 
factor  of  five  or  more,  errors  in  utilization  and  mean  queue 
length  predictions  can  reach  twenty  percent  or  more. 

There is little intuitive basis for choosing CVc according to
equation (9), although it is exact for the special case in which
all service time distributions are exponential so that the system
satisfies local balance. In a central server model, strict
modelling of the aggregate service process leads us to calculate
CVc^2 from

\[ CV_c^2 = \frac{\sum_i P_i\, CV_i^2 M_i^2 + \sum_i P_i M_i^2 - \left(\sum_i P_i M_i\right)^2}{\left(\sum_i P_i M_i\right)^2} \qquad (10) \]

This formula yields a coefficient of variation that is exactly
the coefficient of variation of the overall distribution of
service times in the subsystem. The parallelism of service in
the subsystem is represented by adjusting the service rate, with
a higher rate indicating greater parallelism.


Experimentation  has  shown  that  in  some  queueing  networks, 
calculating  CVc^  according  to  equation  (10)  yields  significantly 
better  results  than  using  equation  (9),  while  in  other  cases,  the 
opposite  is  true  [TL].  Thus,  only  having  the  composite  server 
reflect  the  variance  in  the  aggregate  service  time  distribution 
does  not  help  consistently. 
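
To make the contrast between the two estimates concrete, the following sketch evaluates the squared CV of the overall service time distribution, treating it as a probabilistic mixture of the peripheral service time distributions, which is how equation (10) is read here. The function name and the example numbers are hypothetical.

```python
def mixture_cv2(probs, means, cv2s):
    """Squared CV of the overall service time distribution (equation (10)).

    probs : transition probabilities Pi
    means : mean service times Mi
    cv2s  : squared coefficients of variation CVi^2
    """
    mean = sum(p * m for p, m in zip(probs, means))
    within = sum(p * c * m * m for p, m, c in zip(probs, means, cv2s))
    spread = sum(p * m * m for p, m in zip(probs, means)) - mean ** 2
    return (within + spread) / mean ** 2


# Two exponential servers with different means: equation (9) would give
# CVc = 1, while the mixture is more variable than an exponential.
value = mixture_cv2([0.5, 0.5], [1.0, 3.0], [1.0, 1.0])
```

In this example the differing means alone push CVc^2 to 1.5, even though every individual server is exponential.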


When  replacing  a  subsystem  wit 
is  to  make  the  composite  server  res 
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consistently  well  is  a  strong  argum 
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to  attempt  to  represent  the  service  process.  Instead,  we  must 
attempt  to  represent  the  output  process. 

The  composite  server  may  have  any  one  of  a  number  of  service 
disciplines.  Its  mean  service  rate  must  be  load  dependent  so 
that  the  required  throughput  rate  for  each  multiprogramming  level 
can  be  achieved.  While  any  service  discipline  may  be  chosen,  we 
will  give  special  attention  to  FCFS  and  PS.  (A  load  dependent 
version  of  PS  is  simply  the  single  class  special  case  of  CQ. )  In 
a  locally-balanced  queueing  network  model,  PS  eliminates  the 
effect  of  variance  in  the  service  time  distribution,  making  the 
state  probability  distribution  identical  to  that  with  FCFS 
service  and  an  exponential  service  time  distribution.  When  the 
system  does  not  satisfy  local  balance,  however,  the  variance  of 
service  time  distribution  at  PS  service  centers  does  affect  the 
system  state  probability  distribution.  We  conjecture  that  either 
FCFS  or  PS  at  the  composite  server  can  yield  equally  effective 
representations  of  the  subsystem’s  output  process,  as  long  as  the 
service  time  variance  is  adjusted  appropriately. 

Based  on  the  developments  of  section  3,  Sauer  and  Chandy's 
algorithm  can  be  improved  by  simply  replacing  the  estimation  of 
CVc^2. The simplest generalization of the Sauer and Chandy
formula is

\[ CV_c^2 = 1 + \sum_i P_i^2 (CV_{si}^2 - 1) \qquad (11) \]

This  is  equation  (8)  for  aggregating  substreams,  where  the 
substreams  are  the  service  processes.  It  is  only  accurate  when 
all  the  service  centers  are  heavily  loaded. 

If  a  good  estimate  of  the  throughput  (or  mean  interarrival 
time)  of  the  subsystem  is  available,  a  more  sophisticated  formula 
can  be  used: 

\[ CV_c^2 = 1 + \sum_i P_i^4 \left(\frac{M_{si}}{M_c}\right)^2 (CV_{si}^2 - 1) \qquad (12) \]

Equation  (12)  is  derived  from  equations  (1)  and  (8)  assuming 
Poisson  input  to  the  peripheral  service  centers.  It  does  not 
assume  the  service  centers  to  be  heavily  loaded.  The  results  of 
using various ways of estimating CVc^2 will be discussed in
section  5. 


4.2  Improved  Throughput  Estimation 


In  order  to  reduce  the  error  in  throughput  rate  calculation, 
it  is  necessary  to  avoid  the  Exponential  Assumption.  This 
eliminates  the  possibility  of  using  the  efficient  computational 
techniques  that  require  local  balance  in  order  to  analyse  the 
subsystem.  An  alternative  approach  has  been  developed  by 
Zahorjan  [Zah].  He  proposes  an  approximate  solution  technique 
based on the global balance method to obtain load dependent
throughput  rates  appropriate  to  the  actual  service  time 
distributions  of  the  subnetwork.  The  problem  with  using  global 
balance  to  solve  for  the  throughputs  exactly  is  that  the  number 
of  simultaneous  linear  equations  which  must  be  solved  grows 
combinatorially  with  the  number  of  service  centers,  stages  of 
service  at  each  center,  and  customers  in  the  system.  Even  for 
subnetworks  of  moderate  complexity,  a  global  balance  solution 
would  be  infeasible. 

The computational complexity of the global balance technique
can be greatly reduced by repeated applications of Norton's
theorem reductions aggregating only a small number of service
centers (probably two) at each step but using global balance
equations to determine the subnetwork throughput. The load
dependent server formed at each step then replaces the
corresponding service centers in the original network, and the
sequence of reductions continues until there is only a single load
dependent server. In this way the total amount of work required
to solve the system grows only linearly with the number of
service centers, and the total storage requirement at any point
in the computation is much smaller than it would be if the entire
network consisting of all the centers were solved using global
balance equations. The tradeoff involved in deciding how many
service centers to aggregate at each step is that the complexity
of the solution increases greatly as the number of centers grows,
but the error of the approximation decreases because fewer
applications of the approximation technique are required.
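
The growth rates described above can be illustrated with a simple state count. This is only a rough sketch under strong simplifying assumptions that are not taken from the text: every center is counted as having a single service stage, and each pairwise step is assumed to solve a two-center network once for every population from 1 to N. Multiple stages per center would enlarge both counts. The function names are hypothetical.

```python
from math import comb


def full_state_count(centers, customers):
    """States of one closed network: ways to place the customers
    among the centers (single-stage centers assumed)."""
    return comb(customers + centers - 1, centers - 1)


def pairwise_state_count(centers, customers):
    """Total states over all pairwise reduction steps.

    Assumes centers - 1 steps, each solving a two-center network
    for every population n = 1..customers (n + 1 states each).
    """
    per_step = sum(n + 1 for n in range(1, customers + 1))
    return (centers - 1) * per_step


full = full_state_count(10, 20)
pairwise = pairwise_state_count(10, 20)
```

Under these assumptions the pairwise count grows linearly in the number of centers, while the single global balance problem grows combinatorially.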


A  program  capable  of  solving  a  restricted  class  of  single 
class  queueing  networks,  including  the  central  server  model,  has 
been  implemented  [Zah].  The  errors  in  performance  measure 
predictions are significantly smaller than those obtained by
treating only the CV Omission error as was done in section 4.1.
The  error  which  does  occur  is  due  to  the  Input/Output  Assumption. 
The  accuracy  of  this  approximation  will  be  illustrated  with  an 
example  in  section  5. 


4.3  Improved Input/Output Process Characterization

Once  the  Exponential  Assumption  is  avoided,  the  most 
significant  remaining  error  source  is  the  Input/Output  Assumption 
where  the  input  process  and  output  process  of  a  subsystem  are 
presumed  to  be  identical.  In  order  to  remove  this  assumption,  we 
must consider the subsystem's environment. In a central server
model, this involves extending equation (12) to include the
effect of the central server. Assuming Poisson input to the
central server, equations (3), (5), and (8) can be composed to
express the coefficient of variation of the subsystem's departure
process (see figure 4):

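\[ CV_{a4}^2 = 1 + \sum_i P_i^2 \left\{ \left(\frac{P_i M_{si}}{M_{a0}}\right)^2 (CV_{si}^2 - 1) + P_i \left[1 - \left(\frac{P_i M_{si}}{M_{a0}}\right)^2\right] \left[ \left(\frac{M_{s0}}{M_{a0}}\right)^2 (CV_{s0}^2 - 1) + \left(1 - \left(\frac{M_{s0}}{M_{a0}}\right)^2\right) (CV_{a0}^2 - 1) \right] \right\} \qquad (13) \]

By assuming some value for CVa0^2, equation (13) allows us to
calculate CVa4^2. In a closed central server model, however,
since the departure process from the subsystem (a4) is the
arrival process at the central server (i.e., CVa0^2 = CVa4^2),
equation (13) reduces to:

\[ CV_{a0}^2 = 1 + \frac{\sum_i P_i^2 \left\{ \left(\frac{P_i M_{si}}{M_{a0}}\right)^2 (CV_{si}^2 - 1) + P_i \left[1 - \left(\frac{P_i M_{si}}{M_{a0}}\right)^2\right] \left(\frac{M_{s0}}{M_{a0}}\right)^2 (CV_{s0}^2 - 1) \right\}}{1 - \left[1 - \left(\frac{M_{s0}}{M_{a0}}\right)^2\right] \sum_i P_i^3 \left[1 - \left(\frac{P_i M_{si}}{M_{a0}}\right)^2\right]} \qquad (14) \]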

To  use  the  coefficient  of  variation  obtained  from  equation 
(14) ,  we  modify  the  load  dependent  composite  server  specified  by 
Zahorjan  by  giving  it  a  structure  dependent  on  this  estimate  of 
coefficient  of  variation. 
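
The closed-loop condition behind equation (14) can also be solved by simple iteration. The sketch below composes the queueing, splitting, and merging approximations (in the forms summarized in section 6) around the central server loop, then iterates until the arrival CV^2 at the central server reproduces itself. The function names and the example parameters are hypothetical, and the composition order is the reading of equation (13) assumed here.

```python
def departure_cv2(P, Ms, cv2_s, Ms0, cv2_s0, Ma0, cv2_a0):
    """Squared CV of the subsystem's departure process (a4).

    P, Ms, cv2_s : transition probabilities, mean service times and
                   squared service CVs of the peripheral centers
    Ms0, cv2_s0  : mean service time and squared service CV of the
                   central server
    Ma0, cv2_a0  : mean and squared CV of interarrival times at the
                   central server
    """
    r0 = Ms0 / Ma0
    # Queueing at the central server.
    cv2_d0 = 1 + r0 ** 2 * (cv2_s0 - 1) + (1 - r0 ** 2) * (cv2_a0 - 1)
    total = 1.0
    for p, ms, c in zip(P, Ms, cv2_s):
        cv2_ai = 1 + p * (cv2_d0 - 1)        # splitting
        ri = p * ms / Ma0                    # Msi/Mai with Mai = Ma0/Pi
        cv2_di = 1 + ri ** 2 * (c - 1) + (1 - ri ** 2) * (cv2_ai - 1)
        total += p ** 2 * (cv2_di - 1)       # merging, (Ma0/Mdi)^2 = Pi^2
    return total


def closed_loop_cv2(P, Ms, cv2_s, Ms0, cv2_s0, Ma0, tol=1e-12, max_iter=1000):
    """Iterate CVa0^2 = CVa4^2 to a fixed point, starting from Poisson."""
    x = 1.0
    for _ in range(max_iter):
        new = departure_cv2(P, Ms, cv2_s, Ms0, cv2_s0, Ma0, x)
        if abs(new - x) < tol:
            return new
        x = new
    return x


# Hypothetical example parameters.
P, Ms, cv2_s = [0.6, 0.4], [20.0, 30.0], [4.0, 0.5]
Ms0, cv2_s0, Ma0 = 10.0, 2.0, 15.0
fixed_point = closed_loop_cv2(P, Ms, cv2_s, Ms0, cv2_s0, Ma0)
```

Because the composed expression is linear in CVa0^2 - 1, the iteration converges quickly, and the fixed point can equally be obtained in one step by solving the linear equation.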


5.  Examples


In  this  section,  we  examine  two  examples  and  compare  the 
accuracy  achieved  by  the  proposed  techniques.  The  exact  answers 
are  obtained  by  using  QSOLVE,  a  global  balance  solution  system 
[Lev]. 

The parameters and results of example 1 are shown in figure
5. The methods that employ the Exponential Assumption all
produce similar answers even though they deal differently with
the CV Omission error. Since the exact answer and Zahorjan's
answer lie substantially below the others, it appears that the
Exponential Assumption is more critical than the CV Omission.
Once throughput is estimated, the manner of setting CVc^2 is not
critical. Thus, seeking less arbitrary formulae for CVc^2 does
not  seem  to  be,  by  itself,  worthwhile.  Moreover,  in  this  case, 
even  direct  Norton's  theorem  reduction  does  better  than  the  more 
detailed  approaches. 

Zahorjan's approach, which corrects for the Exponential
Assumption,  gives  consistently  good  approximations.  Over  a  wide 
range  of  examples,  the  largest  utilization  and  mean  queue  length 
errors  observed  were  approximately  five  percent. 

The structure and results for example 2 are shown in figure
6. Having seen the importance of accurate estimation of
throughput  in  example  1,  we  select  here  exponential  servers  at 
the  peripheral  service  centers,  thus  avoiding  any  throughput 
error  due  to  the  Exponential  Assumption.  In  this  situation,  most 
of  the  approaches  that  we  have  discussed  are  equivalent  since 
they  all  treat  the  special  case  of  exponential  peripheral  service 
times  identically. 

If  the  arrival  process  at  the  central  server  is  Poisson, 
however,  then  the  departure  process  has  the  coefficient  of 
variation  given  by  equation  (13).  Further,  since  the  network  is 
of  the  central  server  form,  the  subsystem's  departure  process  is 
identically  the  arrival  process  at  the  central  server.  Hence, 
equation (14) can be used to establish the coefficient of
variation  of  the  transition  process  between  the  subsystem  and  the 
central  server. 

The  use  of  equation  (14) ,  after  establishing  throughput  rates 
by  Zahorjan's  pair-wise  aggregation  technique,  gives  a  good 
estimation  of  the  coefficient  of  variation  of  the  arrival  process 
at the central server. This CV^2 can then be used with the
composite server's load dependent throughput rates to improve the
accuracy  of  Zahorjan's  approximation. 

6.  Summary  and  Conclusions 

In  section  3,  we  established  formulae  for  approximating  the 
coefficient  of  variation  of  transition  processes  in  queueing 
network  models.  In  terms  of  figure  4,  the  formulae  are 
summarized  below: 

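Splitting: For i = 1,2,3,

\[ M_{ai} = M_{a0}/P_i \]

\[ CV_{ai}^2 = 1 + P_i (CV_{d0}^2 - 1) \]

Queueing: For i = 0,1,2,3,

\[ M_{di} = M_{ai} \]

\[ CV_{di}^2 = 1 + \left(\frac{M_{si}}{M_{ai}}\right)^2 (CV_{si}^2 - 1) + \left[1 - \left(\frac{M_{si}}{M_{ai}}\right)^2\right] (CV_{ai}^2 - 1) \]

Merging:

\[ M_{a0} = \left[\sum_i \frac{1}{M_{di}}\right]^{-1} \]

\[ CV_{a0}^2 = 1 + \sum_i \left[\left(\frac{M_{a0}}{M_{di}}\right)^2 (CV_{di}^2 - 1)\right] \]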
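
The three summarized transformations translate directly into small functions on (mean, CV^2) pairs. The sketch below is illustrative: the function names and the example parameters are hypothetical, and a stream is represented simply as a tuple of its mean interevent time and squared coefficient of variation.

```python
def split(stream, p):
    """Substream that takes each event of `stream` with probability p."""
    mean, cv2 = stream
    return (mean / p, 1 + p * (cv2 - 1))


def queue(stream, service_mean, service_cv2):
    """Departure stream of a service center fed by `stream`."""
    mean, cv2 = stream
    r2 = (service_mean / mean) ** 2
    return (mean, 1 + r2 * (service_cv2 - 1) + (1 - r2) * (cv2 - 1))


def merge(streams):
    """Superposition of independent streams."""
    mean = 1.0 / sum(1.0 / m for m, _ in streams)
    cv2 = 1 + sum((mean / m) ** 2 * (c - 1) for m, c in streams)
    return (mean, cv2)


# One pass around a central server loop with exponential servers and
# Poisson input: every transition process should stay Poisson.
a0 = (15.0, 1.0)
d0 = queue(a0, 10.0, 1.0)
arrivals = [split(d0, p) for p in (0.6, 0.4)]
departures = [queue(a, ms, 1.0) for a, ms in zip(arrivals, (20.0, 30.0))]
a4 = merge(departures)
```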


In the context of diffusion approximations, others have
obtained results related to ours [Gel, Kob]. We state them here
with  the  notation  of  figure  4. 

(a)  If CVai^2 = 1 for a single queue,

\[ CV_{di}^2 = 1 + \left(\frac{M_{si}}{M_{ai}}\right)^2 (CV_{si}^2 - 1) \qquad \text{[DC]} \]

(b)  If Ms0/Ma0 = 1 in a central server model,

\[ CV_{ai}^2 = 1 + P_i (CV_{s0}^2 - 1) \qquad \text{[Kob]} \]


(c)  If  QjO  is  the  transition  probability  between  center  j  and 
center  0  in  a  general  queueing  network, 

2 

CVaO^  •=  1  +  MaOjC^^HCVsj  ^-1)  [EKI] 

(d)  With  QjO  defined  as  above. 


CVdO' 


rMsO. 


1 


.  2 


[GP] 
Result (a) is the special case of equation (3) with Poisson
arrivals, and result (b) is the special case of equation (5) in
which the central server is heavily loaded. In the case of a
central server model, result (c) (with Qj0=1) specializes to
equation (7) rather than to equation (8), which, as we have seen,
represents merging more accurately. When all customers at
service center zero come directly from a single service center
(say j), result (c) further specializes to:

\[ CV_{a0}^2 = CV_{sj}^2 \qquad (15) \]

This makes the heavy load assumption obvious.


Result  (d)  has  numerous  special  cases  of  interest.  It 
specializes  to  equation  (5)  for  splitting,  and,  like  result  (c)  , 
it  specializes  to  equation  (7),  rather  than  equation  (8),  for 
central  server  models.  When  all  center  zero  customers  come  from 
center  j,  result  (d)  specializes  to: 


\[ CV_{d0}^2 = 1 + r^2 (CV_{s0}^2 - 1) + (1 - r)(CV_{dj}^2 - 1) \qquad (16) \]

where r = Ms0/Ma0.

Note that equation (16) differs from equation (3) only in that
(1-r^2) is replaced by (1-r).

Substantial evidence indicates equation (3) to be the more
accurate approximation. Figure 3 compares both equation (3) and
equation (16) with simulation results for a range of CVdj and
CVs0 values. The simulation program used HE2 and GEr
distributional forms to obtain various values for CVdj and CVs0.
Equation (3)'s predictions are significantly closer to the
simulation results. Additionally, table I compares equations
(15), (16) and (3) in four specific cases involving only
exponential and deterministic distributions. For the M/M/1 case,
all three agree and are exact. For the M/D/1 case, equations
(16) and (3) agree and are correct [Pac], while equation (15) is
correct only when the queue utilization is near one. For the
D/M/1 case, equation (15) again gives the answer for very high
utilization. Equations (16) and (3) differ although they approach
one another as utilization approaches either one or zero. For
intermediate utilizations, simulation verifies that equation (3)
makes more accurate predictions of CVd0^2 than does equation (16).
Finally, for the D/D/1 case, equations (3) and (15) agree and are
correct, while equation (16) is correct only as utilization
approaches either zero or one. Because equation (3) makes better
predictions than does equation (16), it is possible that the more
general formula of Gelenbe and Pujolle [GP, eq. (13)] would be
improved by using (1-r^2) in place of (1-r).

All  the  results  in  section  3  lead  to  the  conclusion  that 
transition  processes  in  closed  queueing  networks  tend  toward 
being  Poisson.  High  or  low  variance  service  time  processes  may 
introduce  transition  processes  of  high  or  low  variance  but 
splitting  and  merging  tend  to  direct  the  coefficient  of  variation 
of  the  resulting  process  toward  one. 
Figure 5: Example 1
(network diagram not recoverable; surviving labels: M = 10, CV = 3)

Technique    Utilization    Mean Queue Length
Exact        0.461          1.068
Norton       0.632          1.134
[SC]         0.544          1.109
[Zah]        0.460          1.094
Eq. 8        0.541          1.115
Eq. 12       0.538          1.122


Figure 6: Example 2
(network diagram not recoverable; surviving labels: M = 30, CV = 1)

Technique    Utilization    Mean Queue Length
Exact        0.573          1.570
Norton       0.552          1.581
Eq. 14       0.563          1.585


                            CVd0^2 as predicted by
Case    CVdj^2   CVs0^2   Eq. 15 [RK1]   Eq. 16 [GP]   Eq. 3
M/M/1   1        1        1              1             1
M/D/1   1        0        0              1-r^2         1-r^2
D/M/1   0        1        1              r             r^2
D/D/1   0        0        0              r(1-r)        0

Table I: Comparison of CVd0^2 predicting methods.
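
The entries of table I follow directly from the three formulas and can be checked mechanically. The sketch below uses hypothetical function names; r is the utilization Ms0/Ma0, and the squared CVs of the exponential and deterministic distributions are 1 and 0 respectively.

```python
def eq15(cv2_dj, cv2_s0, r):
    """Heavy-load result: departures inherit the service CV^2."""
    return cv2_s0


def eq16(cv2_dj, cv2_s0, r):
    """Specialization of the Gelenbe and Pujolle result, weight (1 - r)."""
    return 1 + r ** 2 * (cv2_s0 - 1) + (1 - r) * (cv2_dj - 1)


def eq3(cv2_dj, cv2_s0, r):
    """Equation (3), weight (1 - r^2)."""
    return 1 + r ** 2 * (cv2_s0 - 1) + (1 - r ** 2) * (cv2_dj - 1)


# (arrival CV^2, service CV^2) for each case in table I.
cases = {"M/M/1": (1, 1), "M/D/1": (1, 0), "D/M/1": (0, 1), "D/D/1": (0, 0)}

r = 0.5
table = {name: (eq15(a, s, r), eq16(a, s, r), eq3(a, s, r))
         for name, (a, s) in cases.items()}
```

At r = 0.5 the D/D/1 row shows the difference most clearly: equation (16) predicts r(1-r) = 0.25, while equation (3) gives zero, as a deterministic queue fed by a deterministic stream requires.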
Somewhat contrary to intuition, splitting then immediately
remerging an event stream according to equations (5) and (8) does
not result in a stream with the same coefficient of variation as
the original stream. This situation can be explained as follows.
While equation (5) is exact, equation (8) is not. Equation (8)
is an approximation that characterizes stochastic processes
entirely by their first two moments, even though higher moments
do influence the result of aggregating streams. When we remerge
a stream that has just been split, we happen to be systematically
choosing one situation in which the approximation provided by
equation (8) is very poor.
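
The split-then-remerge discrepancy is easy to reproduce. In the sketch below (hypothetical function names), a stream with squared CV `cv2` is split with probabilities Pi using the splitting formula, CVai^2 = 1 + Pi(CV^2 - 1), and the pieces are immediately recombined using the merging formula with weights (My/Mxi)^2 = Pi^2.

```python
def split_cv2(p, cv2):
    """Squared CV of a substream taking each event with probability p."""
    return 1 + p * (cv2 - 1)


def merge_cv2(fractions, cv2s):
    """Merging formula; fractions are My/Mxi, the share of events
    contributed by each substream."""
    return 1 + sum(f ** 2 * (c - 1) for f, c in zip(fractions, cv2s))


def split_then_merge(probs, cv2):
    """Split a stream by `probs`, then remerge the pieces at once."""
    return merge_cv2(probs, [split_cv2(p, cv2) for p in probs])


original = 4.0
recombined = split_then_merge([0.5, 0.5], original)
```

Algebraically the result is 1 + (CV^2 - 1) times the sum of Pi^3, so the original CV^2 is recovered only for a Poisson stream or when nothing is actually split.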


Since equation (3) was derived in the context of an open
single queue network, we would expect it to be least accurate
when there are very few customers in the system. With only a few
customers in the network, the arrival process at a specific queue
is strongly dependent on the number of customers already at that
queue. However, because we sought exact solutions for our
examples in order to evaluate the approximations, we were forced
to examine small networks with very few customers. With more
customers in the network, the accuracy of the approximations
should be improved.

While we have examined only a few examples, it appears the
techniques that have been discussed for estimating transition
process characteristics can aid in obtaining better
approximations in the analysis of queueing network models. The
techniques suggested in this paper are more accurate, but more
complex and expensive than those of Sauer and Chandy [SC]. At
the same time, we believe that they are simpler and more
economical, but less accurate than the iterative technique of
Chandy, Herzog, and Woo [CHW2].
THE  INFLUENCE  OF  WORKLOAD  ON  INTERACTIVE
RESPONSE  TIME  IN  A  VIRTUAL  MEMORY  SYSTEM 


Satish  K.  Tripathi  and  Kenneth  C.  Sevcik 


ABSTRACT 


Using an analytic queueing network model with limited
multiprogramming level and constrained memory, we study the
variations in interactive response times due to changes in
workload. We examine both open models (infinite user population)
and closed models (finite user population). In order to measure
response time meaningfully, we include the time spent queueing
for entry to the multiprogramming mix. It is found that in open
systems and closed systems with large user population, the
expected time-in-system is very sensitive to the multiprogramming
limit and a slight deviation from the optimal value may have
substantial effect. Our calculations are based upon actual
lifetime functions obtained from analysis of program reference
traces.
I.  Introduction


Many  operating  systems  for  large  computers  permit  application 
programs  to  be  written  as  if  a  large,  fast  random-access  storage 
were dedicated to each program. This illusion of "virtual
memory" [DENNING70] is supported by dynamically mapping the
active portions of each address space to a smaller real memory.
The mapping is typically expressed in terms of "pages" (physical
units) or "segments" (logical units). Whenever a program
attempts  to  access  a  portion  of  its  address  space  that  is  not 
currently  located  in  real  memory,  an  interruption  of  processing 
occurs,  and  the  operating  system  alters  the  current  mapping  to 
provide  a  place  in  real  memory  for  the  required  information. 
Clearly,  the  frequency  with  which  such  interruptions  of 
processing  occur  depends  on  how  large  real  memory  is  relative  to 
the  aggregate  size  of  the  address  spaces  of  all  active  programs, 
among other factors. Many programs are allowed to run
concurrently with the intention of assuring high utilization of
critical resources. Changing the memory mapping due to page
faults,  however,  involves  extra  information  transfer  and 
processing  overhead.  When  too  many  programs  are  competing  for 
the  use  of  real  memory,  the  advantage  of  having  critical 
resources  highly  utilized  can  be  more  than  negated  by  the 
overhead  that  is  added  to  the  workload. 


In  this  paper,  we  investigate  how  the  interactive  response 
time  depends  on  the  upper  limit  on  the  multiprogramming  level. 
We will call this the multiprogramming limit. Throughout the
paper, we will assume an activation algorithm that activates
arriving requests immediately unless the multiprogramming level
has already reached its limit. We will assume a paging system
operating on a page-on-demand basis under various page
replacement  algorithms.  We  will  examine  two  aspects  of  system 
performance:  throughput  (or  system  capacity)  and  mean  response 
time  (or  mean  time-in-system).  Note  that  the  former  performance 
measure  is  of  greater  interest  to  computer  system  management, 
while  the  latter  is  of  greater  interest  to  computer  system  users. 


The  analytic  model  (see  figure  1)  used  in  this  study 
characterizes  requests  by  their  alternating  use  of  the  central 
processing  unit  (CPU)  and  various  input-output  devices,  including 
a  paging  device.  At  the  end  of  each  interval  of  CPU  processing. 
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that  is  exponentially  distributed  with  mean  T  before  submitting 
another  request.  Once  again,  since  the  number  of  requests 
simultaneously  being  served  is  limited,  a  queue  of  users  awaiting 
admission  to  the  multiprogramming  mix  may  form. 


The  analysis  of  both  mode 
First,  a  central  server  model  inc 
analysed  under  various  levels 
multiprogramming  level  (or  load) , 
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respectively  illustrate  the  op 
decomposed  to  permit  analysis. 
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ermined.  These  load-dependent 
load-dependent  service  rates  of 
presents  the  entire  subsystem  in 
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load-dependent  server.  For  the 
is  a  machine  repairman  model 
nt  central  server  and  a  terminal 
N  users.  Figures  2  and  3 
en  and  closed  models  as  they  are 


The  following  notation  will  be  used  throughout  the  paper: 


N:     the size of the customer population (in the closed system)

D:     the upper limit on multiprogramming level

u[i]:  service rate (i.e., inverse of mean service time) at
       device i

p[i]:  proportion of transitions to device i after visiting the
       CPU

v[i]:  average number of visits per request to device i


In  section  2,  we  review  related  work  and  discuss  the 
technique  for  analyzing  the  model.  In  section  3,  we  present  the 
results  for  open  systems  and  then  compare  them,  in  section  4, 
with  results  for  closed  systems.  In  open  systems  throughput  is 
an  inappropriate  performance  measure  and  is  insensitive  to 
multiprogramming  limits  in  many  cases.  We  consider  the  expected 
time-in-system  instead.  It  is  found  that  in  open  systems,  as  well 
as  in  closed  systems  with  large  N,  the  expected  time-in-system  is 
very  sensitive  to  the  deviation  of  the  multiprogramming  limit 
from  the  optimal  value.  This  sensitivity  decreases  with  smaller 
N.  Finally,  in  section  5  we  discuss  the  implications  of  the 
results. 


II.  Related Work and Analysis of the Model

The  models  used  in  this  study  incorporate  several  features 
that  have  been  studied  previously.  The  central  server  model  and 
an  efficient  computational  algorithm  for  evaluating  performance 
measures in exponential queueing networks were both developed by
Buzen  [BUZEN71].  For  models  with  finite  customer  populations, 
one  of  the  peripheral  servers  in  a  central  server  network  can  be 
used  to  represent  the  delays  between  successive  submissions  by 
the  customer  population.  The  service  rate  of  the  peripheral 
Figure 1: Memory-constrained, Limited Multiprogramming Level, Central Server Model.


Figure 3: Simplified Closed System.
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[ SCHERR67, SEKIN072 ]. 
A few studies have considered the effects of limited memory.
Both  Buzen  [BUZEN71]  and  Denning  and  Graham  [DENNING75]  have 
suggested  methods  by  which  queueing  network  model  parameters  can 
be  adjusted  to  reflect  changes  in  memory  contention  due  to 
changes  in  multiprogramming  level.  The  necessary  changes  in 
order  to  reflect  a  general  increase  in  paging  rate  are: 


(a)  increase  the  number  of  transitions  per  request  to  the 
paging  device, 

(b)  decrease the mean CPU burst length in order to hold the
total  expected  CPU  requirement  per  request  constant. 

The  extent  to  which  the  number  of  transitions  to  the  paging 
device  per  request  must  be  increased  should  be  determined  from 
the  "lifetime  function"  that  specifies  a  program's  mean  time 
between  page  faults  given  its  allocation  of  real  memory 
[DENNING75].  Most  studies  have  assumed  that  each  request  in  the 
multiprogramming  mix  receives  an  equal  allocation  of  the  system's 
real  memory. 
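
The two parameter adjustments can be sketched as follows. Everything specific in this example is an assumption made for illustration: the function and parameter names are hypothetical, the lifetime function is an arbitrary power law rather than a measured one, memory is shared equally among the active requests, and the number of CPU bursts per request is taken to be one more than the total number of device visits.

```python
def paging_parameters(total_cpu, other_io_visits, memory, level, lifetime):
    """Adjust model parameters for a given multiprogramming level.

    total_cpu       : total expected CPU time per request (held constant)
    other_io_visits : visits per request to non-paging devices
    memory          : real memory, shared equally among `level` requests
    lifetime        : callable giving mean CPU time between page faults
                      for a given memory allocation
    Returns (paging device visits per request, mean CPU burst length).
    """
    allocation = memory / level
    paging_visits = total_cpu / lifetime(allocation)      # adjustment (a)
    cpu_bursts = 1 + other_io_visits + paging_visits      # assumed count
    burst_length = total_cpu / cpu_bursts                 # adjustment (b)
    return paging_visits, burst_length


def example_lifetime(m):
    """Purely illustrative power-law lifetime function."""
    return 1e-4 * m ** 2


low = paging_parameters(1.0, 10, 100.0, 2, example_lifetime)
high = paging_parameters(1.0, 10, 100.0, 5, example_lifetime)
```

Raising the multiprogramming level shrinks each allocation, which increases the paging visits and shortens the CPU bursts while the total CPU requirement per request stays fixed.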

The  importance  of  the  choice  of  multiprogramming  level  is 
widely  recognized,  and  many  analytic  investigations  of  the 
problem  have  been  reported.  Here,  we  will  discuss  those  that 
have  used  models  sharing  important  characteristics  with  ours. 

Most previous studies of memory-constrained models have
focussed on maximizing throughput. Buzen used a single analytic
function involving three parameters to represent the mean number
of memory references between page faults as a function of the
average amount of real memory available to each program
[BUZEN71]. He demonstrated the propriety of using the central
server model to investigate memory-constrained systems and
provided formulae for adjusting system parameters in light of the
current multiprogramming level. However, very few exp[...] are
reported.
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maximization  of  throughput  in  virtual  memory  systems 
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Itzhak  and  Heyman  studied  response  times  in  an  open  sys 
Poisson arrivals [AVI-ITZHAK73]. Their analysis, which
similar  to  ours,  involved  two  stages.  First,  a  closed 
model  is  analyzed  at  various  levels  of  multiprogramming 
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The  analysis  done  in  this  paper  involves  a  number  of  the 
techniques  discussed  above.  In  the  closed  model,  decomposition 
is  used  to  separate  the  system  into  two  parts,  the  central  server 
III. Open System Results

In  an  open  system  that  is  not  saturated,  CPU  utilization  and 
throughput  are  both  fully  determined  by  the  amount  of  work 
arriving  to  the  system.  In  typical  cases,  a  system  will  be  able 
to  handle  a  given  arrival  rate  at  a  number  of  multiprogramming 
levels.  Let  us  define  capacity  as  the  throughput  attained  at  a 
fixed  multiprogramming  limit  assuming  constant  backlog  (i.e.,  the 
activation  queue  is  never  empty).  Figure  4  indicates,  as  a 
function  of  multiprogramming  limit,  the  relationship  between 
throughput  under  a  given  load  and  the  capacity.  (The  parameters 
of  the  system  used  for  all  figures  are  given  in  Appendix  A.)  Note 
that  for  small  multiprogramming  limits,  throughput  under  constant 
backlog  is  an  increasing  function  of  multiprogramming  limit.  In 
this  range,  load  exceeds  throughput  rate,  so  attained  throughput 
equals  capacity  because  the  system  is  always  busy.  As  long  as 
capacity  increases  with  multiprogramming  limit,  more  requests  can 
be  processed  concurrently  without  saturating  any  single  device  in 
the system. A lack of congestion at the paging device indicates
that the amount of real memory is sufficient to provide each
active request enough memory to operate without an excessive
page  fault  frequency.  For  medium  values  of  the  multiprogramming 
limit,  the  throughput  is  simply  the  offered  load  to  the  system. 
For  higher  values  of  multiprogramming  limit,  so  many  programs  are 
competing  for  memory  that  the  paging  device  becomes  the  critical 
resource  and  throughput  is  diminished. 
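
The three regimes described above follow from one relation: in an open system, attained throughput is the smaller of the offered load and the capacity at the current multiprogramming limit. The sketch below illustrates this with a made-up capacity curve; the numbers are hypothetical and are not the values plotted in Figure 4.

```python
def attained_throughput(offered_load, capacity):
    """Throughput of an open system at one multiprogramming limit.

    Below capacity the system simply passes through the offered load;
    above it the system is always busy and delivers its capacity.
    """
    return min(offered_load, capacity)


# Hypothetical capacity (throughput under constant backlog) by
# multiprogramming limit: rising, flat, then falling as paging dominates.
capacity_by_limit = {1: 0.40, 2: 0.70, 3: 0.90, 4: 1.00, 6: 0.95,
                     8: 0.80, 10: 0.60, 12: 0.40}

offered_load = 0.75
throughput_by_limit = {
    limit: attained_throughput(offered_load, cap)
    for limit, cap in capacity_by_limit.items()
}
```

With these illustrative numbers, throughput equals capacity at limits 1 and 2, sits at the offered load for limits 3 through 8, and falls back to capacity at limits 10 and 12.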
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In  open  systems  throughput  has  a  constant  value  for 
of  multiprogramming  limit.  For  example,  in  figure  4, 
load  throughput  is  constant  between  multiprogramming  limi 
and  ten.  Therefore,  throughput  is  not  a  useful  per 
IV. Open System Versus Closed System Results


Let  T  denote  the  mean  time  between  the  completio 
request  from  a  user  and  the  submission  of  his  next  reques 
an open system can be thought of as the limiting case of
system  where  the  number  of  customers  increases  and  T  is  i 
in  order  to  hold  the  effective  load  on  the  system  c 
Because  we  are  studying  the  effects  of  paging  contenti 
following  approach  to  establish  T  is  appropriate.  In 
make  a  fair  comparison  between  an  open  system  with  arriva 
and  a  closed  system  with  N  customers,  T  should  be  cho 
that  with  unlimited  memory  the  throughput  of  the  closed  s 

as  that  in  the  open 
effective  arrival  rate 
equal  effective  load, 
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fact  that  paging  contention  retards  throughput  will  act  t 
the  effective  load  on  the  system. 
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A  principal  difference  between  open  and  closed  sy 
that  closed  systems  never  saturate  as  open  systems  do,  I 
systems,  the  arrival  rate  is  controlled  by  the  throughp 
The  arrival  rate  is  proportional  to  the  number  of  custom 
already  awaiting  or  receiving  service.  Thus,  as  Ion 
system  throughput  does  not  drop  to  zero,  the  expected 
system  remains  finite  for  all  multiprogramming 
Naturally,  as  N  increases,  however,  the  expected  time-i 
curves  become  steeper  and  steeper  for  multiprogrammin 
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either  above  or  below  the  optimum  (see  figure  6) .  Figure  6a 
compares the mean response times obtained with various numbers of
terminals.  Figures  6b  and  6c  present  the  behaviour  of  the  mean 
response  time  for  different  loads  assuming  16  and  128  terminals 
respectively.  Our  results  indicate  that  the  multiprogramming 
limit  that  minimizes  expected  time  in  system  is  independent  of  N. 

Consider  the  problem  of  determining  the  mean  response  time 
conditioned  on  required  service.  Operating  with  a  higher 
multiprogramming  limit  gives  an  advantage  to  shorter  jobs  since 
waiting  in  the  outer  queue  is  reduced  and,  once  activated, 
shorter  jobs  complete  quickly.  For  small  N,  because  the  response 
time  curve  is  flat,  operating  at  a  multiprogramming  limit 
slightly  higher  than  the  optimum  would  not  degrade  the  expected 
time-in-system  significantly.  But,  for  large  N  and,  of  course, 
in  open  systems,  the  response  time  curve  is  steep  and  even  a 
slight variation from the optimum point will have a significant
effect.


Observe that for any given effective load in a closed system,
the arrival rate varies according [...]
awaiting or receiving service. As N increases, in order to
maintain a constant effective load, the proportion of customers
[...] larger. In the limit as N
[...] rate is constant and independent
[...] waiting service. Thus, as N
[...] arrival rate on the number of
[...]hes.


V.  Conclusions 
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em.  Because  throughput  is  restricted  by  the 
lization  is  not  a  useful  performance  measure 
there  is  not  a  constant  backlog.  We 
behaviour  of  time-in-system  for  various 
its  and  loads.  We  also  examine  the 
n  response  time  conditioned  on  service  time, 
behaviour for different values of N.


[Figure 6. (a) Expected Time-in-System in a Closed System for Different N. (b) Expected Time-in-System for N=16. (c) N=128. Horizontal axis: Multiprogramming Limit.]

Our calculations have been based on actual lifetime functions
obtained from analysis of program reference traces [GRAHAM76].
The selection of a page replacement algorithm has a substantial
effect on the curves that we have examined. Figure 7 contrasts
the throughput and expected time in system obtained with the page
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replacement  algorithms  LRU  and  OPT  [COFFMAN73].  Observe  that  the 
two  curves  have  different  optimum  multiprogramming  limits. 
APPENDIX A

In  all  the  examples  we  use  models  similar  to  the  one  given  in 
figure  1  with  M=3.  Closed  systems  have  a  finite  number  of  users 
N,  and  open  systems  have  an  infinite  number  of  users.  The 
following  values  were  used  to  analyse  the  model. 

u[1] = .005
u[2] = .005
u[3] = .004
v[1] = 40
v[2] = 40

Total  memory  size  =  200K 

Average  total  CPU  time  per  job  =  7000 

The  p[i]'s,  i  =  1,2,3,  vfl],  and  u[0]  are  calculated  using  the 
method described in [DENNING75]. We use the LRU lifetime function
for  all  the  examples  except  in  figure  7  where  both  LRU  and  OPT 
are  used.  The  values  of  the  lifetime  curve  obtained  by  Graham 


[GRAHAM76]
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table  below. 
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250 
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350 
THE USE OF PERCENTILES IN CHARACTERIZING
SERVICE TIME DISTRIBUTIONS


Edward  D.  Lazowska 


Success  stories  and  "doesn't  matter"  results  to  the  contrary, 
there  are  times  when  representing  the  CPU  as  an  exponential 
server  in  a  queueing  network  model  results  in  unacceptable  error 
in  matching  or  predicting  various  performance  measures.  Should 
this occur, the modeller will usually turn to a first-come-first-served
service discipline with a two-stage hyperexponential
server  that  matches  the  mean  and  variance  of  the  observed  service 
time  distribution.  The  third  constraint  required  for  unique 
parameter  selection  is  chosen  arbitrarily,  typically  for 
algebraic  convenience. 

This  paper  develops  a  new  technique  for  matching  general 
service  time  distributions  in  central  server  queueing  network 
models.  I  will  show  that  attempting  to  match  several  higher 
moments  of  these  distributions  is  not  only  insufficient,  but  also 
unnecessary  and  frequently  misleading.  The  magnitude  of  the 
error  will  be  demonstrated  using  a  model  of  the  IBM  System/370 
Model  165  at  the  University  of  Toronto  Computer  Centre.  I  will 
show  that  attempting  to  match  the  empirical  cumulative 
distribution  function  of  the  service  time  distribution  is  a 
superior  approach.  I  will  introduce  a  three-stage  server  with  an 
inherent  density  function  resembling  those  of  empirically 
observed  service  time  distributions,  and  describe  my  experiences 
in  using  a  heuristic  that  selects  parameters  for  this  server 
based  upon  the  location  of  certain  percentiles  of  the  observed 
distribution.  I  will  also  describe  more  sophisticated  parameter 
selection  techniques  that  can  be  effectively  used  to  achieve 
arbitrary  accuracy. 

The  results  in  this  paper  constitute  an  improved  methodology 
for  representing  general  service  time  distributions  in  queueing 
network  models  of  computer  systems. 


This  paper  is  an  expanded  version  of  one  accepted  for  publication 
in  the  Proceedings  of  the  1977  International  Symposium  on  Computer 
Performance  Modeling,  Measurement  and  Evaluation,  August  16-18, 
1977,  Yorktown  Heights,  New  York,  U.S.A.,  and  constitutes  a  chapter 
of  the  author's  forthcoming  Ph.D.  dissertation. 
The Occasional Inadequacy of Exponential Servers


Simple  queueing  network  models  using  exponential  servers  have 
demonstrated  a  remarkable  ability  to  predict  computer  system 
performance  accurately.  It  is  an  unfortunate  fact  of  life, 
though,  that  significant  errors  sometimes  arise  in  using  them  to 
model  even  seemingly  straightforward  computer  systems.  An 
insufficiently  accurate  characterization  of  the  CPU  service  time 
distribution  may  be  the  principal  cause  of  this  error. 


Throughout  this  paper  I  shall  make  reference  to  a  queueing 
network  model  of  the  IBM  System/370  Model  165  at  the  University 
of  Toronto  Computer  Centre.  We  shall  study  the  predictions  of 
this  model  when  various  server  structures  are  used  to  represent 
the  CPU  service  time  distribution.  The  basic  model  was 
constructed  with  considerable  care.  It  is  of  the  central  server 
type,  with  eight  exponential  servers  representing  the  major  I/O 
devices  of  the  actual  system.  Its  parameters  were  determined 
using  a  combination  of  hardware  and  software  monitor  data  and 
accounting  data.  Particular  attention  was  paid  to  the  effect  of 
parallelism in the I/O subsystem [Zahorjan 1976]. This subsystem
model,  once  developed,  was  reduced  to  a  single  load-dependent 
server  via  Norton's  theorem  [Chandy  et  al.  1975].  The  structure 
of  the  model  is  illustrated  in  Figure  1  on  the  following  page. 


Our  standard  of  comparison  will  not  be  the  actual  system,  but 
a  detailed  simulation  model  in  which  the  CPU  service  time 
distribution  was  obtained  by  processing  full  interrupt  traces  of 
the  actual  system  in  operation.  Since  this  distribution  includes 
CPU  activity  attributable  to  the  operating  system,  overhead  is 
implicitly  represented.  The  performance  measure  used  throughout 
is  CPU  utilization,  typically  the  most  robust  metric. 


The departure point for this study is the observation that,
with a multiprogramming level of 5, the CPU utilization of the
Toronto system is .74. Using an exponential CPU server matching
the observed mean, the queueing network model predicts a
utilization of .83, an unacceptably large error.
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variation  of  empirically  observed  CPU  service  time  distributions 
is  greater  than  one,  this  server  is  the  two-stage 
hyperexponential (HE-2), illustrated in Figure 2.


Figure  2 


In  attempting  to  match  the  mean  and  variance  of  an  observed 
distribution  with  a  two-stage  hyperexponential  server,  we  arrive 
not at a single set of parameters, but at a family of parameter
values  constrained  by  the  following  two  equations: 


mean = (P)T1 + (1-P)T2
variance = 2((P)T1^2 + (1-P)T2^2) - mean^2


where  T1  is  the  mean  service  time  of  stage  1 ,  P  is  the  selection 
probability  of  that  stage,  and  T2  is  the  mean  service  time  of 
stage  2. 
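
To make the one-parameter family concrete, the sketch below solves the two equations above for T1 and T2 once P is chosen. It is an illustration under stated assumptions, not the procedure used in this paper: the function name is hypothetical, stage 1 is taken by convention to be the long stage (T1 > mean > T2), and the bound on P is simply the point at which T2 reaches zero.

```python
import math


def he2_family_member(mean, cv, p):
    """Return (T1, T2) for the HE-2 server with stage-1 probability p
    that matches the given mean and coefficient of variation (cv > 1)."""
    if cv <= 1:
        raise ValueError("an HE-2 server needs a coefficient of variation > 1")
    # (variance - mean^2) / 2, the quantity both stage means depend on
    excess = mean ** 2 * (cv ** 2 - 1) / 2.0
    # Largest feasible p: beyond this point T2 would become negative.
    p_max = 2.0 / (cv ** 2 + 1)
    if not 0 < p <= p_max:
        raise ValueError("p outside the feasible range (0, %g]" % p_max)
    t1 = mean + math.sqrt((1 - p) / p * excess)
    t2 = max(0.0, mean - math.sqrt(p / (1 - p) * excess))
    return t1, t2


# Example: one member of the family with mean 8.7 and c.v. 12.7
t1, t2 = he2_family_member(mean=8.7, cv=12.7, p=0.005)
```

Sweeping p from near zero up to the bound traces out the whole family: T2 falls from just under the mean towards zero, while T1 falls from very large values towards mean/p.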

Based on intuition gained from the Pollaczek-Khinchin
equation, it is surprising that when the various members of this
family are used in a queueing network model they yield
dramatically different predictions for CPU utilization. For
example,  when  the  family  having  a  mean  of  8.7  and  a  coefficient 
of  variation  of  12.7  (the  observed  characteristics  of  the  Toronto 
CPU  service  time  distribution)  is  used  in  our  model,  the 
approximate  range  of  predicted  CPU  utilizations  is: 


    P        T1        T2      Mean   Coef. of variation   Utilization

  .0005   3454.06     6.95     8.7          12.7               .79
    .        .          .       .             .                 .
    .        .          .       .             .                 .
  .0124    701.16     0.00+    8.7          12.7               .56

The  range  of  values  that  P  can  assume  is  constrained  by  the 
variance  of  the  distribution.  P  was  not  pushed  to  the  absolute 
extreme  in  the  above  example,  nor,  of  course,  was  the  example 
artificially  constructed  to  achieve  maximum  effect. 
There is a convincing intuitive explanation for this
phenomenon. For arbitrary but fixed mean and variance, Figure 3
illustrates the behaviour of T1 and T2 as P is varied over its
feasible range. (Note that the y-axis scales for T1 and T2
differ.)  The  value  of  T2  varies  almost  linearly  from  the  mean  of 
the  server  (asymptotically,  when  P  is  extremely  small)  to  zero 
(again  asymptotically,  as  P  approaches  its  upper  feasible  limit). 
The  value  of  T1  asymptotes  towards  the  mean  of  the  server  as  P 
approaches  its  upper  feasible  limit. 


Figure  4 


Figure  3 


Figure  4  illustrates  the  relative  contribution  of  each  stage 
to  the  overall  mean  of  the  server  as  P  is  varied  over  its 
feasible  range.  The  relative  contribution  of  stage  1  is  (P)T1 
divided  by  the  mean  of  the  server;  for  stage  2,  the  numerator  is 
(1-P)T2.  Notice  that  for  P  near  the  upper  feasible  limit,  the 
contribution  of  stage  2  to  the  overall  mean  is  negligible.  In 
other  words,  essentially  the  entire  utilization  of  the  CPU  is 
attributable to a stage whose selection probability is much less than 1. With
a  probability  of  .99,  a  customer  will  have  a  negligible  service 
time  at  the  CPU  server  and  will  proceed  immediately  to  the  I/O 
servers.  I/O  queue  lengths  will  grow,  and  this  congestion  will 
decrease  system  throughput  and  CPU  utilization. 

As  P  decreases,  the  relative  contributions  of  the  two  stages 
first  become  equal,  then  stage  2  dominates  until  finally,  as  P 
approaches  zero,  almost  all  of  the  CPU  activity  is  attributable 
to  stage  2.  But  this  stage  now  has  a  selection  probability  of 
nearly  1,  so  CPU  utilization  is  nearly  as  high  as  for  an 
exponential  server. 
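
The relative contributions discussed above are easy to compute directly. The sketch below is illustrative only; the function name is hypothetical, and the two parameter sets are the extremes quoted earlier in this section, with the T2 value listed as "0.00+" taken as exactly zero.

```python
def relative_contributions(p, t1, t2):
    """Return the fractions of the server's mean service time
    contributed by stage 1 and by stage 2."""
    mean = p * t1 + (1 - p) * t2
    return p * t1 / mean, (1 - p) * t2 / mean


# P near zero: stage 2, chosen almost always, carries most of the mean.
small_p = relative_contributions(0.0005, 3454.06, 6.95)

# P near its upper limit: stage 1, chosen about once in eighty visits,
# carries essentially the entire mean.
large_p = relative_contributions(0.0124, 701.16, 0.0)
```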

This  observation  explains  another  phenomenon  of  the  HE-2  CPU 
server:  as  its  coefficient  of  variation  decreases,  so  does  the 
range  of  CPU  utilizations  that  can  be  attained.  For  instance, 
using  the  same  model  with  a  coefficient  of  variation  of  2  instead 
of  12.7  results  in  the  following  range  of  predicted  CPU 
utilizations: 
its selection probability reaches .4. There is greater inherent balance in
this  server,  so  queue  lengths  at  the  I/O  servers  do  not  become 
excessive,  and  CPU  utilization  is  not  as  severely  affected. 
Note,  incidentally,  that  the  actual  CPU  utilization  of  the  system 
is  barely  included  in  the  range  of  this  server. 

The  two  extremes  of  an  HE-2  family  may  be  distinguished  more 
formally  by  considering  their  probability  density  functions. 
Figures  5  and  6  display  the  general  forms  of  the  density 
functions  corresponding  to  HE-2  servers  at  the  low  and  high 
extremes of predicted CPU utilization, respectively:


Figure  5  Figure  6 

In  Figure  5,  corresponding  to  the  low  utilization  server,  the 
percentiles  of  the  distribution  are  concentrated  near  the  origin. 
The  density  function  corresponding  to  the  high  utilization  server 
differs  remarkably,  considering  that  the  first  two  moments  of  the 
distributions  are  identical.  Its  percentiles  are  spread  over  a 
significantly  greater  range. 

Since  the  actual  CPU  utilization  is  included  in  the  range 
predicted  by  the  HE-2  family  matching  the  observed  mean  and 
variance,  it  is  natural  to  seek  a  third  constraint  that  will 
allow  us  to  select  the  appropriate  set  of  parameters.  Two 
approaches have been used in practice. The more common [Sauer &
Chandy 1975] attempts to achieve a balanced server, in the sense
of Figure 4, by setting (P)T1 = (1-P)T2. This constraint is
somewhat  arbitrary,  and  when  the  HE-2  server  satisfying  it  is 
used  in  the  model,  predicted  CPU  utilization  is  .69. 
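
With the balance constraint (P)T1 = (1-P)T2 added to the mean and variance equations, the parameters have a closed form. The sketch below derives them under that constraint; the function name is hypothetical and the smaller root for P is chosen so that stage 1 remains the long stage.

```python
import math


def balanced_he2(mean, cv):
    """Return (P, T1, T2) for the HE-2 server matching the given mean and
    coefficient of variation (cv > 1) with (P)T1 = (1-P)T2 = mean / 2."""
    if cv <= 1:
        raise ValueError("an HE-2 server needs a coefficient of variation > 1")
    # Substituting T1 = mean/(2P) and T2 = mean/(2(1-P)) into the variance
    # equation gives P(1-P) = 1 / (2(cv^2 + 1)); take the smaller root.
    p = 0.5 * (1.0 - math.sqrt((cv ** 2 - 1) / (cv ** 2 + 1)))
    t1 = mean / (2 * p)
    t2 = mean / (2 * (1 - p))
    return p, t1, t2


p, t1, t2 = balanced_he2(mean=8.7, cv=12.7)
```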




The  more  conscientious 
moment  of  the  observed  CPU  se 
to  the  first  two.  Although 
this  match  with  an  HE-2  serve 
Toronto  system  can  be  matched 
predicted  CPU  utilization  is 
HE-2  server  with  fixed 
utilization  is  proportional 
illustrated  in  Figure  6  h 
illustrated  in  Figure  5, 
The following table summarizes our rather discouraging
results to this point:

    server                            utilization   percent error

    system                                .74
    exponential                           .83           +12
    HE-2
      low extreme of the family           .56           -24
      high extreme of the family          .79           + 7
      matching coef. of skewness          .64           -13
      equal relative contribution         .69           - 7

(The coefficient of skewness equals the cube root of the
skewness, divided by the square root of the variance.)

In short, matching the first few moments of the observed CPU
service time distribution using a two-stage hyperexponential
server does not seem to be a fruitful approach, unless the
modeller is willing to resort to calibration of the model via
arbitrary parameter modification.
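
The coefficient of skewness defined in the parenthetical note above can be computed for any HE-2 server from its raw moments. The sketch below assumes that "skewness" there means the third central moment; the function name is hypothetical.

```python
import math


def he2_skewness_coefficient(p, t1, t2):
    """Cube root of the third central moment divided by the standard
    deviation, for an HE-2 server with stage means t1, t2 and
    stage-1 selection probability p."""
    # Raw moments of a mixture of exponentials: E[X^k] = k! * sum(p_i * T_i^k)
    m1 = p * t1 + (1 - p) * t2
    m2 = 2 * (p * t1 ** 2 + (1 - p) * t2 ** 2)
    m3 = 6 * (p * t1 ** 3 + (1 - p) * t2 ** 3)
    variance = m2 - m1 ** 2
    third_central = m3 - 3 * m1 * m2 + 2 * m1 ** 3
    cube_root = math.copysign(abs(third_central) ** (1.0 / 3.0), third_central)
    return cube_root / math.sqrt(variance)


# Sanity check: with equal stage means the server is exponential,
# for which this coefficient is the cube root of 2.
exponential_case = he2_skewness_coefficient(0.3, 5.0, 5.0)
```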
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Jaiswal  derives  an  expression  for  server  utilization  in  the 
M/G/1  queueing  system  with  fixed  number  of  customers  which  shows 
that  utilization  is  inversely  proportional  to  the  Laplace 
transform  of  the  CPU  service  time  distribution.  (In  Jaiswal's 
result,  the  transform  function  is  evaluated  at  several  fixed 
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values.)  Price  notes  that  since  the  Laplace  transform  encodes 
all  moments  of  a  distribution,  there  is  no  reason  to  believe  that 
the  first  few  moments  restrict  the  range  of  the  value  of  the 
transform sufficiently to ensure accuracy. He provides an HE-2
example  illustrating  this  point. 

The  implication  of  Price’s  observation  is  important  and  has 
not  received  the  attention  it  deserves:  to  ensure  accuracy  in 
the  M/G/1  queueing  system  with  fixed  number  of  customers,  it  is 
sufficient  that  the  server  have  the  same  Laplace  transform  value 
at  certain  specific  points  as  the  observed  service  time 
distribution.

The  Laplace  transform  of  a  service  time  distribution  is  the 
integral  of  the  product  of  its  probability  density  function  and 
the  negative  exponential  function: 


F*(s) = \int_0^\infty e^{-sx} f(x) dx

Price’s  observation  rings 

true. 
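
For an HE-2 server the transform has a simple closed form, since an exponential stage with mean T has transform 1/(1 + sT). The sketch below evaluates it for the two extremes of the family quoted earlier; the function name and the evaluation point s = 0.1 are illustrative assumptions, not the specific points used in Jaiswal's result.

```python
def he2_laplace(s, p, t1, t2):
    """Laplace transform of the HE-2 service time density at s >= 0.

    A stage with mean 0 is a point mass at the origin and
    contributes its full probability for every s.
    """
    return p / (1.0 + s * t1) + (1.0 - p) / (1.0 + s * t2)


s = 0.1  # hypothetical evaluation point
low_utilization_extreme = he2_laplace(s, 0.0124, 701.16, 0.0)
high_utilization_extreme = he2_laplace(s, 0.0005, 3454.06, 6.95)
```

Although the two servers share their first two moments, the transform of the low-utilization extreme is close to 1 at this point while that of the high-utilization extreme is markedly smaller, which agrees with the direction of the effect described in the text.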
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e  this  observation  can  be  used  in  building  queueing 
odels,  we  must  assess  its  robustness  by  answering  two 
:  (1)  Does  the  Laplace  transform  result  hold  for 
systems  other  than  M/G/1  with  fixed  number  of  customers? 
a  practical  matter,  is  it  any  easier  to  characterize  a 
ion  accurately  in  terms  of  its  Laplace  transform  than  in 
its  moments?  I  shall  address  each  of  these  questions  in 


•  Applicability  to  Other  Queueing  Network  Configurations 


The  M/G/1  queueing  system  with  fixed  number  of  customers  is  a 
reasonable  high-level  model  of  a  time  sharing  system:  the 
central  server  represents  the  computer  system  itself,  and  the  N 
I/O  servers  represent  interactive  terminals.  For  more  detailed 
studies,  though,  the  model  is  not  sufficiently  realistic.  For 
instance,  should  we  desire  to  represent  a  CPU  and  its  I/O 
servers,  we  are  forced  to  assume  that  all  I/O  servers  have  the 
same  service  time  distribution  and  that  each  I/O  server  is 
uniquely  assigned  to  a  single  customer. 


In  observing  the  behaviour  of  this  queueing  network,  it  is 
apparent  that  the  magnitude  of  the  Laplace  transform  effect 
(i.e., the width of the range of CPU utilizations predicted by
the  various  members  of  an  HE-2  family)  is  dependent  upon  the 
relative load of the CPU and the I/O devices. This makes sense;
the greater the extent to which the CPU is the bottleneck device,
the greater the degree to which its service time distribution
affects performance.


For  this  reason,  it  is  reasonable  to  expect  that  I/O  queueing 
may  substantially  diminish  the  magnitude  of  the  effect.  In  order 
to  establish  the  generality  of  the  result,  I  will  now  consider 
the  other  extreme  of  the  queueing  network  spectrum:  a  system 
with  a  CPU  and  a  single  I/O  server. 
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the  M/G/1  queueing  system  with  bounded  queue  size, 
queueing  system,  customers  arrive  at  a  fixed  rate 
pendent  of  the  number  of  customers  already  in  the 
arriving  customer  finds  N  customers  already  in  the 
er,  it  balks  and  vanishes.  This  queueing  system  is 
a  queueing  network  having  N  customers,  a  FCFS  CPU 
service  time  distribution,  and  a  single, 
distributed I/O server. (As with Price's

the  equivalence  of  these  two  systems  may  not  be 
vious.  Once  again,  though,  their  state  transition 
s  are  identical.  Although  the  open  system  always 
arrival  process,  new  arrivals  are  rejected  if  there 
N  customers  in  the  system,  so  the  effective  arrival 
uring  these  periods.  This  behaviour  is  identical 
closed system.)


Of  course,  most  computer  systems  lie  som 
extremes  delimited  by  the  two  M/G/1  queueing 
described:  there  is  more  than  one  I/O  server, 

so many that queueing never occurs. A proof
transform result for the M/G/1 queueing system
size  would  argue  convincingly  that  the  result  i 
entire  spectrum,  since  we  intuitively  expect 
queueing  to  be  the  crucial  factor  in  determinin 
the  form  of  the  CPU  service  time  distribution 
performance.
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This  proof,  along  with  some  motivational  material,  is 
included as an appendix to this paper. In the remainder of this
section I shall emphasize the magnitude of the effect. Figure 7
illustrates the range of CPU utilizations yielded by the HE-2
family with a mean of 8.7 and a coefficient of variation of 12.7
used  in  the  two-server  queueing  network  model  equivalent  to  the 
M/G/1  queueing  system  with  bounded  queue  size.  The  range  is 
graphed  against  relative  I/O  power,  expressed  as  the  ratio  of  CPU 
service  time  to  I/O  service  time.  (Again,  it  is  not  surprising 
that  both  the  CPU  utilization  and  the  range  decrease  as  the  I/O 
server  becomes  the  bottleneck  device.) 
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Figure  7 


Figure  8 


Figure  8  illustrates  another  interesting  phenomenon, 
shows  the  range  of  predicted  utilizations  as  a  function 
multiprogramming  level  (N)  for  fixed  I/O  service  time, 
graph  is  similar  to  one  that  Price  plots  for  the  queueing  net 
equivalent  to  the  M/G/1  queueing  system  with  fixed  numbe 
customers.  This  phenomenon  is  well  supported  intuitively. 
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Results  similar  to  those  shown  in  Figures  7  and  8  can  be 
demonstrated  for  queueing  network  models  in  the  middle  of  the 
spectrum:  those  with  a  fixed  number  of  non-identical  I/O  devices 
at  which  queueing  occurs.  For  instance,  the  model  of  the 
University  of  Toronto  Computer  Centre  system  exhibits  all  of  the 
characteristics  of  the  Laplace  transform  effect.  Given  the 
existence  of  these  phenomena  throughout  the  spectrum,  along  with 
proofs  of  the  Laplace  transform  result  at  either  extreme,  we 
confidently  assert  that  the  Laplace  transform  result  applies 
throughout.


Characterizing Distributions in Terms of Laplace Transforms
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specifies  a  value  rather  than  a  function.  Because  the  mass  of 
the  Figure  5  density  function  is  concentrated  near  the  origin, 
the  area  under  the  product  curve  of  this  function  and  the 
negative  exponential  will  be  quite  large,  and  predicted  CPU 
utilization  will  be  quite  low.  The  percentiles  of  the  Figure  6 
density  function  are  spread  over  a  wider  range,  and  the  area 
under  the  corresponding  product  curve  will  be  quite  small, 
leading  to  higher  predicted  utilization. 

The  key  to  the  usefulness  of  the  Laplace  transform  result 
lies  in  the  negative  exponential.  Just  as  values  of  the  density 
function  are  multiplied  by  factors  whose  values  decrease 
exponentially  with  distance  from  the  origin,  so  the  significance 
of  any  error  in  the  specification  of  the  density  function 
decreases  exponentially  with  distance  from  the  origin.  In  other 
words,  we  can  tolerate  large  errors  in  the  specification  of  the 
density  function,  f(x),  for  large  x,  and  still  arrive  at  a 
Laplace  transform  value  extremely  close  to  that  of  the  observed 
distribution.


Matching  Percentiles 


In  characterizing  an  observed  service  time  distribution  in 
terms of its Laplace transform, we are concerned more with
identifying  the  area  in  which  its  mass  is  concentrated  than  with 
the  specific  density  at  a  few  selected  points.  For  this  reason, 
the  cumulative  distribution  function,  F(x),  which  expresses  the 
location  of  the  various  percentiles  of  the  distribution,  is  a 
more appropriate measure than the probability density function,
f(x). The characteristics of the Toronto CPU service time
distribution  (expressed  in  milliseconds)  are: 

          coefficient     coefficient              percentiles
  mean    of variation    of skewness    10th   25th   50th   75th   90th

  8.70        12.7            3.0         1.2    1.9    3.8    6.8   10.9

Despite  its  relatively  high  coefficient  of  variation,  the 
observed  distribution  has  a  median  (we  shall  use  the  50th 
percentile  for  comparative  purposes)  that  lies  quite  far  from  the 
origin.  The  value  of  its  Laplace  transform  is  therefore  fairly 
low, and the CPU utilization fairly high. This statement is
supported  by  the  observation  that  using  the  HE-2  server  at  the 
high  utilization  extreme  of  the  family  matching  the  first  two 
moments  of  the  observed  distribution  (the  HE-2  server  with  a 
density  function  resembling  that  of  Figure  6)  results  in  a 
predicted  utilization  only  slightly  greater  than  the  observed 
value.

To  understand  the  errors  that  result  when  various  HE-2 
servers  are  used  in  the  model,  notice  that  the  observed 
distribution  has  a  relatively  low  coefficient  of  skewness. 
Within an HE-2 family, skewness is proportional to the location
of the median (and thus inversely proportional to P). For the
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family  with  the  observed  mean  and  variance, 
for  the  coefficient  of  skewness  runs  from  2 
the  parameters  by  matching  skewness  commits 
with  a  relatively  low  median,  high  Laplac 
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I  have  developed  a  simple  heuristic  that  attempts  to  match 
the  mean  and  several  percentiles  of  an  observed  distribution  by 
varying the parameters of the server illustrated in Figure 10.
There  are  thus  three  free  parameters;  the  fourth  parameter  is 
adjusted  to  maintain  the  mean.  The  heuristic  biases  in  favour  of 
percentiles  located  near  the  origin,  both  in  selecting  those 
percentiles  to  be  matched  and  in  setting  the  error  tolerance. 
Although  approximating  a  density  function  by  means  of  percentiles 
generally requires that a large number of percentiles be
specified  where  the  slope  of  the  density  function  differs  most 
from  zero,  the  fact  that  we  are  working  with  a  server  whose 
distribution has certain inherent nice properties (e.g.,
continuity)  minimizes  the  impact  of  this  requirement. 

The heuristic used to match F(x) is not interesting enough to
describe in detail. (A more sophisticated approach, which
attempts  to  match  the  Laplace  transform  of  the  observed 
distribution  directly,  will  be  discussed  in  a  subsequent 
section.)  The  value  of  x  at  which  a  particular  cumulative 
percentage  occurs  is  monotone  in  each  free  parameter  except  at 
one  extreme.  It  is  important  to  realize  that  we  are  not  solving 
a  linear  system;  the  existence  of  three  free  parameters  does  not 
imply  that  three  cumulative  percentages  can  be  matched.  For  some 
observed  distributions,  five  points  can  be  matched,  each  to 
within  for  others,  the  best  fit  to  three  points  has  a  maximum 
error  in  excess  of  15%.  For  a  particular  cumulative  percentage 
of  the  observed  distribution,  F(x),  the  error  is  measured  as  the 
difference between F(x) and the cumulative percentage that
results  when  the  cumulative  distribution  function  of  the  server 
is  evaluated  at  x. 
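
The error measure just described can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and an exponential server stands in for the three-stage server of Figure 10, whose cumulative distribution function is not reproduced here.

```python
import math

def percentile_errors(observed_points, model_cdf):
    """For each observed point (x, F(x)), evaluate the server's CDF at the
    same x and report the difference F(x) - G(x)."""
    rows = []
    for x, f_obs in observed_points:
        f_model = model_cdf(x)
        rows.append((x, f_obs, f_model, f_obs - f_model))
    return rows

def max_abs_error(rows):
    """Largest absolute difference over all matched points."""
    return max(abs(row[3]) for row in rows)

def exponential_cdf(mean):
    """Stand-in server CDF (an assumption made for this sketch only)."""
    return lambda x: 1.0 - math.exp(-x / mean)

# Hypothetical "observed" points: the 25th, 50th and 75th percentiles of an
# exponential distribution with mean 1.
observed = [(-math.log(1.0 - p), p) for p in (0.25, 0.50, 0.75)]

rows_good = percentile_errors(observed, exponential_cdf(1.0))  # exact match
rows_bad = percentile_errors(observed, exponential_cdf(2.0))   # wrong mean
```

A fitting heuristic of the kind described would vary the server parameters to drive `max_abs_error` below a chosen tolerance at the selected percentiles.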

The  observed  distribution  of  the  Toronto  system  can  be 
matched  quite  well  by  the  server.  When  a  three-percentile  fit  is 
attempted,  the  x-values  corresponding  to  the  25th,  50th  and  75th 
percentiles  yield  cumulative  percentages  of  .252,  .518  and  .722, 
respectively.  At  the  expense  of  some  accuracy  at  these 
percentiles, we can achieve a five-percentile fit that includes
the 15th and 90th.

When  the  model  is  evaluated  using  either  of  these  servers, 
the predicted utilization of the CPU is .75, an error of roughly
1%.


Further  Experiments 


A  number  of  experiments  have  been  performed  that  add  weight 
to  the  success  and  potential  of  this  new  approach  to 
characterizing  service  time  distributions: 


•  Matching  the  Percentiles  of  Other  Observed  Distributions 

In  describing  the  Toronto  CPU  service  time  distribution  we 
described  a  contrasting  distribution:  one  with  low  median,  high 
skewness,  and  low  utilization.  Such  a  distribution  was 
constructed,  with  mean  and  variance  reasonably  close  to  those  of 
the  observed  distribution.  Its  characteristics  are: 


coefficient 
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percentiles 

mean 

of  variation 
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Jlllth 

2^th  50th  75th  90th 

8.  73 

11.3 

4.  5 

0.  5 

0.6  0.9  1.2  1.7 

I shall refer to this distribution as D2, and to the observed
distribution as D1.

When D2 is employed in the simulation model, CPU utilization
is  .61.  The  mean  and  variance  of  D2  are  sufficiently  close  to 
those  of  D1  that  the  utilizations  predicted  by  the  various  HE-2 
servers  can  be  taken  from  the  table  that  appeared  earlier.  No 
HE-2  server  is  able  to  match  all  of  the  first  three  moments  of 
this  distribution;  the  high  extreme  of  the  family  matching  the 
first  two  moments  comes  closest,  and  results  in  an  error  of  30% 
in predicted CPU utilization.

further work must be done to determine tolerable
the general robustness of the technique is evident.


•  Matching  Percentiles  Using  an  HE-2  Server 

Because  the  density  function  of  an  HE-2  server  has  its 
greatest  value  at  the  origin,  it  is  impossible  to  obtain  a  close 
fit  to  typical  CPU  service  time  distributions,  which  have  zero 
density  there.  Nonetheless,  I  have  attempted  to  match  the  mean 
and  three  percentiles  of  both  D1  and  D2  using  HE-2  servers. 

For  the  HE-2  matching  D1,  the  x-values  corresponding  to  the 
10th,  25th  and  50th  percentiles  yield  cumulative  percentages  of 
.169,  .253  and  .443,  respectively.  Surprisingly,  when  the  model 
is  evaluated  using  this  server,  CPU  utilization  is  .77,  an  error 
of  only  4%. 


For  the  HE-2  matching  D2,  the  respective  cumulative 
percentages  are  .283,  .330  and  .451.  Predicted  CPU  utilization 
using  this  server  is  .60,  an  error  of  less  than  2%. 

These  two  experiments  strongly  support  the  robustness  of  the 
technique.  Note,  incidentally,  that  the  error  in  matching  D2  is 
not  as  severe  as  appears  at  first  glance.  The  slope  of  the 
cumulative  distribution  function  is  so  great  in  this  range  that 
the  x-value  of  each  percentile  and  the  value  of  the  density 
function there are only slightly in error.


•  The  Apparent  Unimportance  of  Variance 

There  is  only  a  limited  correlation  between  the  variance  of 
the  observed  CPU  service  time  distribution  and  the  variance  of 
the  servers  that  yield  the  correct  CPU  utilization. 

First  consider  D1,  which  has  a  coefficient  of  variation  of 
12.7.  The  three-stage  server  matching  three  percentiles  has  a 
coefficient of variation of 2.0, while the one matching five
D2 has a coefficient of variation of 11.3.

I do not mean to imply that predicted CPU utilization is
unrelated to the coefficient of variation of the CPU server. The
two are inversely proportional in the following limited sense:


Consider a queueing network model in which the CPU is
represented by an HE-2 server with mean M and coefficient of
variation V. Let the predicted CPU utilization of this model be
U. There exists an HE-2 server with mean M' = M and coefficient
of variation V' < V that, when used in the model, results in a
predicted CPU utilization U' > U.

Buzen [1977] has explained this phenomenon in terms of a
"limited damage" argument: in a closed queueing network, the
performance degradation resulting from a customer with an
extremely long service time is bounded, because at most N-1
customers (where N is the multiprogramming level) will queue
behind  that  customer.  Thus  the  transient  behaviour  of  such  a 
system  is  much  less  volatile  than  that  of  a  similar  open  system. 

This  observation  lies  at  the  heart  of  the  results  in  this 
paper.  In  essence,  the  moments  of  a  distribution  are  strongly 
influenced  by  its  tail,  while  the  Laplace  transform  of  a 
distribution  tends  to  emphasize  its  form  near  the  origin.  Thus 
in  a  closed  queueing  network,  the  Laplace  transform  provides  a 
superior characterization of the service time distribution.


Parameter Selection Techniques of Greater Sophistication


Based  upon  the  results  of  this  paper,  our  objective  in 
representing  the  CPU  in  a  central  server  queueing  network  model 
must  be  to  select  a  server  with  Laplace  transform  values  equal  to 
those  of  the  observed  service  time  distribution.  At  one  extreme 
of  the  queueing  network  spectrum  (equivalent  to  the  M/G/1 
queueing  system  with  bounded  queue  size)  the  transform  is 
evaluated at a single point, L (the inverse of the mean I/O
service time). At the other extreme (equivalent to the M/G/1
queueing system with a fixed number of customers) the transform
is evaluated at the N points nL, 0 ≤ n ≤ N-1, where N is the number
of customers in the system and L, once again, is the inverse of
the  mean  I/O  service  time. 

The  simple  heuristic  used  for  parameter  selection  in  the 
modelling  experiments  of  previous  sections  can  best  be  described 
as  an  indirect  approach  to  this  objective,  since  it  seeks  to 
match  the  Laplace  transform  by  matching  the  percentiles  of  the 
observed  distribution  lying  near  the  origin.  In  this  section,  I 
shall  describe  two  alternative  approaches  to  parameter  selection. 
The  first  is  an  improved  indirect  method,  based  on  recent  work  by 
Bux and Herzog. The second is a direct method; in other words,
it  selects  a  server  on  the  basis  of  its  Laplace  transform.  The 
goal  of  this  section  is  to  specify  concretely  a  parameter 
selection  methodology  that  exploits  the  results  of  the  paper. 


•  Numerical  Evaluation  of  the  Laplace  Transform 

Regardless  of  whether  an  indirect  or  direct  parameter 
selection  technique  is  employed,  the  first  step  is  to  numerically 
evaluate the Laplace transform of the observed service time
distribution.


In  general,  this  task  can  be  rather  tricky.  We  can  eliminate 
a  large  measure  of  the  difficulty  by  working  in  terms  of  the 
cumulative  distribution  function  of  the  service  time 
distribution,  F(x),  since  it  is  both  smoother  and  more  easily 
measured than the probability density function, f(x). This is
easily done, for if A*[s] is the Laplace transform of F(x), then
sA*[s] is the Laplace transform of f(x).


Our task, then, is to determine the value of the Laplace
transform of the service time distribution

$$ B^*[s] = s \int_0^\infty e^{-sx} F(x)\,dx $$

for certain values of s. We shall use composite Gauss quadrature
to evaluate the integral, as follows:

□ Determine the values of s that must be considered.

□ Find the upper limit of integration that is required to
ensure sufficient accuracy over this range of s.

□ Divide the interval of integration into sub-intervals, and
use a low-order Gauss formula to integrate each sub-interval.

The  smallest  non-zero  value  of  s,  namely  L,  will  require  the 
largest interval of integration, since B*[0] equals one
regardless of the form of F(x). Knowing L and the rate at which
F  is  asymptotic  to  one,  a  safe  upper  limit  for  the  interval  can 
easily  be  determined.  The  width  of  the  sub-intervals  can  be 
allowed to increase with increasing x.

In practice, I do not concern myself with tuning the
integration to the specific system being analyzed (except, of
course, for scaling with respect to the mean). I generally use
[...] sub-intervals, each twice the width of the one preceding it.
This requires that F(x) be specified at fifteen points when a
[...]-point Gauss formula is employed in each sub-interval. For
the service time distributions and ranges of s that I have
considered, this provides accuracy to the fourth decimal place.
Alternatively, should existing measurement data specify F(x) at
predetermined values, the sub-intervals can be selected to [...].
In either case, a modicum of cleverness in programming
the integration makes it possible to determine the accuracy of
the solution, as well as its sensitivity to the particular points
at which F(x) is specified.
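
The quadrature procedure can be sketched as follows. This is a minimal illustration, not the routine actually used: it assumes five sub-intervals of doubling width with a three-point Gauss-Legendre rule in each (one combination that needs F(x) at fifteen points), and it assumes that F(x) may be taken as 1 beyond the upper limit, so that the neglected tail contributes exp(-s * upper). The function name and the `first_width` parameter are hypothetical.

```python
import math

# Three-point Gauss-Legendre nodes and weights on [-1, 1].
GAUSS3_NODES = (-math.sqrt(3.0 / 5.0), 0.0, math.sqrt(3.0 / 5.0))
GAUSS3_WEIGHTS = (5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0)

def laplace_from_cdf(F, s, first_width, n_intervals=5):
    """Approximate B*[s] = s * integral_0^inf exp(-s x) F(x) dx.

    The range [0, upper] is split into n_intervals sub-intervals, each
    twice as wide as the one before; each is integrated with the
    three-point rule. Beyond `upper`, F is assumed equal to 1.
    """
    total = 0.0
    left, width = 0.0, first_width
    for _ in range(n_intervals):
        half = 0.5 * width
        mid = left + half
        for node, weight in zip(GAUSS3_NODES, GAUSS3_WEIGHTS):
            x = mid + half * node
            total += half * weight * math.exp(-s * x) * F(x)
        left += width
        width *= 2.0
    # Tail correction: s * integral_upper^inf exp(-s x) dx = exp(-s * upper).
    return s * total + math.exp(-s * left)

# Sanity check against a case with a known transform: exponential service
# with mean 1 has B*[s] = 1 / (1 + s).
exp_cdf = lambda x: 1.0 - math.exp(-x)
approx = laplace_from_cdf(exp_cdf, 0.5, first_width=0.25)
```

Scaling `first_width` with the mean of the observed distribution plays the role of the scaling mentioned above; comparing results for different numbers of sub-intervals gives a rough indication of the accuracy achieved.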

As an example, the following table displays the calculated
values of the Laplace transform of D1, the observed CPU service
time distribution of the University of Toronto Computer Centre
system. The transform is evaluated at points corresponding to
the throughput rates of the I/O subsystem with zero to four
customers present:


number of
customers         s         B*[s]

    0             0         1
    1           .0474       .7482
    2           .0813       .6885
    3           .1063       .6304
    4           .1250       .5902

To evaluate the transform at twenty points requires one second of
CPU time using a Gauss-Legendre quadrature routine from the STUNT
package [Johnston & Addison 1977].


• Matching Percentiles Using Linear and Non-linear Programming

In a recent paper, Bux and Herzog [1977] describe a technique
that  selects  parameters  for  a  restricted  form  of  the  general  Cox 
server  based  upon  the  mean,  the  second  moment,  and  an  arbitrary 
number  of  points  on  the  cumulative  distribution  function  of  an 
observed  service  time  distribution. 


Their server, illustrated in Figure 11, consists of R
exponential stages in series, each with the same mean service
time, T. After completing service at stage r, 1 ≤ r ≤ R-1, a
customer either leaves the server, with probability Pr, or enters
service at stage r+1, with probability 1-Pr.


Figure  11 


Bux  and  Herzog  first  demonstrate  that  this  server  possesses 
the  important  property  that  it  can  approximate  any  service  time 
distribution  arbitrarily  closely.  They  then  outline  an  algorithm 
to  perform  the  following  approximation  task: 


Given the J values of the observed service time
distribution's cumulative distribution function, F(x), evaluated
at points x_1, ..., x_J, as well as the first and second moments of
the service time distribution, E[F] and E[F^2], select parameters
R, T and Pr (1 ≤ r ≤ R-1) for the server illustrated in Figure 11
such that the following constraints are satisfied by its
cumulative distribution function, G(x):


$$ G(x_j) \le F(x_j) + \varphi_j, \qquad 1 \le j \le J $$

$$ G(x_j) \ge F(x_j) - \psi_j, \qquad 1 \le j \le J $$

$$ E[G] = E[F] $$

$$ E[G^2] = E[F^2] $$

where the J values of $\varphi_j$ and $\psi_j$ are specified.

As  we  have  discussed  in  a  previous  section,  this 
approximation  task  is  not  a  linear  problem.  Bux  and  Herzog 
proceed  as  follows: 

□ Set R to the minimum possible value, based upon the first and
second moments of the observed service time distribution.

□ For a particular value of T, use a linear optimization
technique to select the optimum values of the Pr.

□ Find the optimum value of T by using a non-linear
optimization technique.

□ If the constraints are not satisfied, increment R by one and
iterate.
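
The structure of this server can be made concrete with a short Python sketch. It rests on an observation that is an interpretation rather than a statement taken from Bux and Herzog: a customer who leaves after stage r has received an Erlang-r service time, so the server's distribution is a mixture of Erlang distributions, and for fixed T its cumulative distribution function is linear in the mixture weights. All function names are hypothetical.

```python
import math

def exit_weights(P):
    """P[r-1] is the probability of leaving after stage r, r = 1..R-1.
    A customer reaching stage R always leaves after it. Returns the
    probability of leaving after each of the R stages."""
    weights, reach = [], 1.0
    for p in P:
        weights.append(reach * p)
        reach *= 1.0 - p
    weights.append(reach)
    return weights

def erlang_cdf(x, r, T):
    """CDF of the sum of r exponential stages, each with mean T."""
    y = x / T
    term, acc = 1.0, 1.0
    for k in range(1, r):
        term *= y / k
        acc += term
    return 1.0 - math.exp(-y) * acc

def server_cdf(x, T, P):
    """G(x): a mixture of Erlang CDFs, linear in the exit weights."""
    weights = exit_weights(P)
    return sum(q * erlang_cdf(x, r, T) for r, q in enumerate(weights, start=1))

def server_mean(T, P):
    """E[G] = T times the expected number of stages visited."""
    weights = exit_weights(P)
    return T * sum(r * q for r, q in enumerate(weights, start=1))
```

With T held fixed, each constraint on G(x_j) and on the moments becomes a linear condition on the exit weights, which is consistent with the use of a linear technique in the inner step and a non-linear search over T in the outer step.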


• Matching the Laplace Transform Directly

Since the effect of the CPU service time distribution in a
central server queueing network model is entirely captured by the
value of its Laplace transform at certain points, the most
straightforward means of characterizing an observed distribution
is to select a server directly on the basis of its corresponding
Laplace transform values. One major advantage of this approach
is the direct correlation between the accuracy of the Laplace
transform approximation and the accuracy of the server when used
in the queueing network model.

The Laplace transform of any Cox-type server is easily
expressed in terms of its parameters. For illustrative purposes
we return to the three-stage server shown in Figure 10, which has
the proper balance of simplicity and sufficiency. Its Laplace
transform is

$$ C^*[s] = \frac{1/T_1}{s + 1/T_1} \left\{ \frac{P/T_2}{s + 1/T_2} + \frac{(1-P)/T_3}{s + 1/T_3} \right\} $$

We shall attempt to select parameters T1, T2, T3 and P for this
server, such that its mean and the value of its Laplace transform
at certain points equal the corresponding values of the observed
service time distribution. To select the parameter values, we
shall use a weighted non-linear least squares approximation
technique.
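
The transform of the three-stage server translates directly into code. The sketch below assumes the structure implied by the formula: a first stage with mean T1, followed by a stage with mean T2 (probability P) or a stage with mean T3 (probability 1-P). The helper that solves for T3 illustrates the idea of holding the mean fixed with a fourth parameter; choosing T3 for that role is an assumption of this sketch.

```python
def cox3_transform(s, T1, T2, T3, P):
    """C*[s] for the three-stage server: stage 1, then stage 2 with
    probability P or stage 3 with probability 1 - P."""
    first = (1.0 / T1) / (s + 1.0 / T1)
    second = (P / T2) / (s + 1.0 / T2)
    third = ((1.0 - P) / T3) / (s + 1.0 / T3)
    return first * (second + third)

def cox3_mean(T1, T2, T3, P):
    """Mean service time of the same server."""
    return T1 + P * T2 + (1.0 - P) * T3

def solve_T3_for_mean(mean, T1, T2, P):
    """Pick T3 so that the server has the required mean (requires P < 1)."""
    return (mean - T1 - P * T2) / (1.0 - P)

# Hypothetical parameter values, for illustration only.
params = (0.5, 1.0, 20.0, 0.9)
h = 1e-6
# The mean is the negative of the slope of the transform at the origin.
slope_estimate = (1.0 - cox3_transform(h, *params)) / h
```

For these values `cox3_mean` gives 3.4, and `slope_estimate` agrees with it closely, which is the property exploited when the mean is enforced through the transform.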

Based upon the two Laplace transform results that we have
discussed, one for either end of the central server queueing
network spectrum, the load-dependent throughput rates of the I/O
subsystem would appear to be the appropriate values of s at which
to compare the two transforms. Evaluations of the Laplace
transform of the observed service time distribution come
virtually for free once the values of its cumulative distribution
function have been provided, though, and since the performance of
the least squares method improves when a large number of
observations are provided, we choose to match over a wider range:
the ten equally spaced points from 0.1 to 1.0. Since the mean of
a  function  is  equal  to  the  negative  of  the  first  derivative  of 
its  Laplace  transform  evaluated  at  zero,  we  can  ensure  that  the 
parameters  selected  will  result  in  a  server  with  the  correct  mean 
by  including  one  artificial  observation:  a  heavily  weighted 
point  at  s  =  .0001,  with  a  value  selected  so  that  the  slope  of 
the  Laplace  transform  between  zero  and  .0001  is  equal  to  the 
negative  of  the  observed  mean. 
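
The set of observations and the weighted objective can be sketched as follows. Only the objective is shown; the minimization itself would be handed to a non-linear least squares routine. The weight of 1000 on the artificial observation is an arbitrary illustrative choice (the text says only that the point is heavily weighted), and the function names are hypothetical.

```python
def cox3_transform(s, T1, T2, T3, P):
    """Transform of the three-stage server (stage 1, then stage 2 with
    probability P or stage 3 with probability 1 - P)."""
    first = (1.0 / T1) / (s + 1.0 / T1)
    return first * ((P / T2) / (s + 1.0 / T2)
                    + ((1.0 - P) / T3) / (s + 1.0 / T3))

def build_observations(transform_values, mean, mean_weight=1000.0):
    """transform_values maps s to the observed B*[s]. Adds the artificial
    point at s = .0001 whose value makes the slope from 0 equal -mean."""
    obs = [(s, v, 1.0) for s, v in sorted(transform_values.items())]
    s0 = 0.0001
    obs.append((s0, 1.0 - s0 * mean, mean_weight))
    return obs

def weighted_sse(params, observations):
    """Weighted sum of squared transform errors for (T1, T2, T3, P)."""
    T1, T2, T3, P = params
    return sum(w * (cox3_transform(s, T1, T2, T3, P) - v) ** 2
               for s, v, w in observations)

# Hypothetical target: transform values generated from a known server.
true_params = (0.5, 1.0, 20.0, 0.9)
true_mean = 0.5 + 0.9 * 1.0 + 0.1 * 20.0
targets = {k / 10.0: cox3_transform(k / 10.0, *true_params)
           for k in range(1, 11)}
observations = build_observations(targets, true_mean)
```

Evaluating `weighted_sse` at `true_params` gives a value near zero, while a parameter set with a different mean, such as `(0.5, 1.0, 10.0, 0.9)`, is penalized both at the ten matching points and at the artificial one.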

Using a non-linear least squares routine from the BMD package
[Dixon 1973], a set of parameters that results in a maximum
transform error of under 2% with respect to D1, the observed CPU
service  time  distribution  of  the  Toronto  system,  can  be  found  in 
less  than  three  seconds  of  CPU  time.  When  the  three-stage  Cox- 
type  server  with  these  parameters  is  used  in  the  model  of  the 
Toronto  system,  predicted  CPU  utilization  is  .74,  precisely  the 
observed  value.  The  Laplace  transform  of  D2,  the  distribution 
complementary to D1, was also evaluated and matched using the
techniques  described  in  this  section.  When  the  parameters 
selected  are  used  in  the  model,  predicted  CPU  utilization  is  .61, 
once  again  precisely  the  observed  value. 

This  parameter  selection  methodology  constitutes  a  specific 
and  well-founded  embodiment  of  the  results  of  the  paper.  It  can 
easily be extended to arbitrarily complex Cox-type servers.


Summary 


The need in analytic modelling for non-exponential service
time distributions with FCFS queueing is well established. There
are occasions, illustrated by my own attempts to model the
University of Toronto Computer Centre system, when other service
centre representations simply are not accurate enough.


In  this  paper,  I  have  demonstrated  that  the  intuition  derived 
from  the  Pollaczek-Khinchin  equation  for  the  M/G/1  queue  is  not 
applicable  in  a  queueing  network  context.  Matching  the  second  or 
even  third  moment  of  the  observed  service  time  distribution  is 
not  only  insufficient,  but  also  unnecessary  and  frequently 
misleading.  Of  course,  such  models  can  be  calibrated  by 
arbitrarily  adjusting  the  parameters  of  the  CPU  or  the  I/O 
servers, but such tampering can only diminish our confidence in
the  predictive  ability  of  the  model. 

I  have  demonstrated  that  in  central  server  queueing  network 
models,  the  effect  of  the  CPU  service  time  distribution  is  fully 
determined  by  the  value  of  its  Laplace  transform  at  several 
points. This result leads to parameter selection methodologies
based upon matching either certain percentiles or certain Laplace
transform  values  of  the  observed  service  time  distribution. 


Of course, if an observed service time distribution is
represented by a server that matches a large number of moments,
then the percentiles and Laplace transform values of the
distribution will also be matched. Similarly, if a large number
of percentiles or Laplace transform values are matched, the
moments will also be correct. Approximations are necessary in
modelling, though, and the results in this paper indicate that
matching either percentiles or Laplace transform values is the
more fruitful approach.

One interesting question is the degree to which the use of
different  classes  of  customers  in  queueing  network  models  can 
diminish  the  need  for  the  techniques  developed  in  this  paper. 
Customer  classes  are  typically  determined  on  the  basis  of  high- 
level  characteristics  such  as  total  service  time  or  the  use  of 
particular  I/O  devices.  These  criteria  are  apt  to  be  unrelated 
to  the  distribution  of  service  times  on  single  visits  to  the  CPU. 
Further,  my  own  experience  with  full  interrupt  traces  from  the 
University  of  Toronto  Computer  Centre  system  indicates  that  the 
CPU  bursts  of  individual  programs  frequently  have  a  rather  high 
coefficient  of  variation.  These  two  factors  argue  that,  at  least 
for  certain  applications,  non-exponential  service  time 
distributions  are  necessary  even  when  multiple  customer  classes 
are  employed. 

Although  the  model  does  not  satisfy  the  local  balance 
assumptions,  computational  requirements  still  can  be  kept  to  a 
minimum  by  using  Norton’s  theorem  to  reduce  the  I/O  subsystem  to 
a  single,  load-dependent  server.  The  model  of  the  Toronto 
system,  for  instance,  was  solved  in  less  than  one  second  of  CPU 
time.  The  techniques  described  in  this  paper  might  also  be  used 
to  select  server  characteristics  for  Shum’s  extended  product  form 
approximation  method,  a  recently  developed  alternative  to 
Norton's  theorem  for  obtaining  solutions  to  queueing  networks 
with  general  service  time  distributions  [Shum  1977]. 

In  summary,  the  techniques  presented  in  this  paper  are 
accurate,  robust  and  practical.  They  have  contributed 
substantially  to  our  understanding  of  the  behaviour  of  our  models 
and to our confidence in their predictive abilities.


Appendix: The Laplace Transform Result for the
M/G/1 Queueing System With Bounded Queue Size


We seek an expression for P0, the equilibrium probability
that an M/G/1 queueing system with bounded queue size, N, is
idle.

In this system, as with most variants of the M/G/1 system,
Pj, the equilibrium probability that there are j customers in the
system, equals PAj, the probability that an arriving customer
finds j customers already in the system.


In  an  M/G/1  system  with  unbounded  queue  size,  PAj  equals  PDj, 
the  probability  that  a  departing  customer  leaves  behind  j 
customers.  This  relationship  does  not  hold  in  the  M/G/1  system 
with  bounded  queue  size,  however,  since  customers  are  sometimes 
rejected. For this latter system, PDj, 0 ≤ j ≤ N, equals PAj | j ≤ N,
the  probability  that  an  arriving  customer  finds  j  customers 
already  in  the  system,  conditioned  on  the  fact  that  this  arriving 
customer  is  not  rejected. 


The  two  sets  of  probabilities  are  proportional,  and  we  can 
express  the  PAj  in  terms  of  the  PDj  by  considering  PAj  for  j 
equal  to  N+1,  the  probability  that  an  arriving  customer  is 
rejected.  (This  is  observed  by  Cohen  [1969],  among  others.) 
Once  this  probability  is  determined,  the  constant  of 
proportionality,  s,  can  be  computed  by  solving 


$$ PA_{N+1} + s \sum_{j=0}^{N} PD_j = 1 \tag{1} $$

The probability that an arriving customer is rejected is equal to
(p - (1-P0))/p, or 1 - 1/p + P0/p, where p is the load factor (the
mean arrival rate divided by the mean service rate).
Intuitively, the numerator of this expression equals the
difference between the presented load (p) and the accepted load
(1-P0). Since P0 equals sPD0, (1) becomes

$$ 1 - \frac{1}{p} + \frac{s\,PD_0}{p} + s \sum_{j=0}^{N} PD_j = 1 \tag{2} $$

and we find

$$ s = \frac{1}{PD_0 + p \sum_{j=0}^{N} PD_j} \tag{3} $$

I stated earlier that P0, the equilibrium probability that an
M/G/1 system with bounded queue size, N, is idle, equals PA0,
which in turn equals sPD0. So from (3), we have

$$ P_0 = \frac{PD_0}{PD_0 + p \sum_{j=0}^{N} PD_j} \tag{4} $$

Cooper [1972] shows that the PDj are proportional to the
corresponding quantities for an M/G/1 system with unbounded queue
size. These latter probabilities, which we denote by P'j, are
easily obtained. The constant of proportionality, t, can be
found by solving

$$ t \sum_{j=0}^{N} P'_j = 1 \tag{5} $$

So from (4) and (5), we have

$$ P_0 = \frac{t P'_0}{t P'_0 + p\,t \sum_{j=0}^{N} P'_j} = \frac{P'_0}{P'_0 + p \sum_{j=0}^{N} P'_j} \tag{6} $$

But P'0 equals 1-p, so

$$ P_0 = \frac{1-p}{(1-p) + p \sum_{j=0}^{N} P'_j} \tag{7} $$

This  is  the  expression  we  have  been  seeking.  It  shows  that 
server  utilization  in  the  M/G/1  queueing  system  with  bounded 
queue  size,  N,  is  a  function  both  of  the  load  factor  and  of  the 
proportion  of  time  that  there  would  be  N  or  fewer  customers  in 
the  corresponding  M/G/1  system  with  unbounded  queue.  Knowing 
this latter quantity is equivalent to knowing the full
equilibrium distribution of the number of customers in the
unbounded M/G/1 system, P'.
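
Expression (7) is simple to evaluate once the queue length probabilities of the unbounded system are available. The sketch below, with hypothetical function names, computes P0 = (1-p) / ((1-p) + p * sum of P'j for j = 0..N) and checks it in the one case where a closed form is readily available: exponential service, for which the unbounded probabilities are geometric and the bounded system holds at most N+1 customers.

```python
def idle_probability(p, unbounded_probs, N):
    """P0 from expression (7): unbounded_probs[j] is the equilibrium
    probability of j customers in the unbounded M/G/1 system."""
    partial = sum(unbounded_probs[: N + 1])
    return (1.0 - p) / ((1.0 - p) + p * partial)

def mm1_probs(p, count):
    """Geometric queue length probabilities for exponential service."""
    return [(1.0 - p) * p ** j for j in range(count)]

p, N = 0.8, 3
p0 = idle_probability(p, mm1_probs(p, 10), N)
utilization = 1.0 - p0
# Standard finite-capacity result for exponential service with N + 2
# states (0 through N + 1 customers): P0 = (1 - p) / (1 - p**(N + 2)).
closed_form = (1.0 - p) / (1.0 - p ** (N + 2))
```

For other service time distributions only the list of unbounded probabilities changes, which is where the Laplace transform of the service time distribution enters.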

It is obvious that the P'j are dependent on certain
characteristics of the service time distribution. In fact, to
determine the P'j we must fully specify the service time
distribution, since the Pollaczek-Khinchin mean value equation
yields exactly M-1 moments of the equilibrium distribution, given
M moments of the service time distribution [Kleinrock 1975].
Thus no finite number of moments will suffice, and nothing short
of the Laplace transform of the service time distribution is
adequate.


Before continuing with the proof, consider (7) in a more
intuitive light. It shows that for a fixed load factor, i.e.,
for a fixed mean service time and mean interarrival time (or I/O
service time, in the equivalent network model), a range of server
utilizations may be attained. This range depends upon three
factors: the service time distribution, the load factor, and the
size of the queue, N (the multiprogramming level, N+1, in the
equivalent network model).


As  the  load  factor  increases,  the  effect  of  the  service  time 
distribution  becomes  more  pronounced.  This  makes  sense;  the  CPU 
(in  the  equivalent  network  model)  is  becoming  the  bottleneck 
device.  The  role  played  by  the  multiprogramming  level  is  more 
complex  to  analyze.  Clearly,  as  N  goes  to  infinity  the  effect  of 
the  service  time  distribution  diminishes.  It  is  less  obvious 
(but  equally  correct)  that  the  effect  is  also  minimal  for 
extremely small values of N, principally because the P'0 term of
the  summation  is  determined  solely  by  the  load  factor,  which  is 
held  constant.  Neither  of  these  extreme  cases  is  of  practical 
interest. 


In  (7),  we  established  the  dependence  of  server  utilization 
upon  the  Laplace  transform  of  the  service  time  distribution.  To 
complete  the  proof  we  must  demonstrate  that  the  relationship  is 
an  inverse  one.  We  must  show  that  for  arbitrary  but  fixed  load 
factor  and  N,  the  probability  that  an  M/G/1  queueing  system  with 
unbounded  queue  contains  N  or  fewer  customers  is  inversely 
proportional  to  the  Laplace  transform  of  the  service  time 
distribution.


We  denote  the  Laplace  transform  by  B*[s].  We  say  that  the 
Laplace  transform  corresponding  to  a  particular  service  time 
distribution  is  greater  than  some  other  Laplace  transform  if  the 
value  of  the  first  transform  evaluated  at  some  point  s*  is  not 
less  than  the  value  of  the  second  transform  evaluated  at  that 
point, for all s*.

Note that the moments of a distribution can be recovered from
its Laplace transform by applying

$$ E[X^j] = (-1)^j \{B^*[0]\}^{(j)} \tag{8} $$

where the parenthesized superscript denotes differentiation.
Thus the Laplace transforms of all service time distributions
with identical means will have the same slope at the origin,
although their values there, B*[0], may differ. The load factor
in (7), p, can be expressed in terms of the transform by

$$ p = -L \{B^*[0]\}^{(1)} \tag{9} $$

where L is the arrival rate.

Jaiswal  [1968]  derives  the  generating  function  of  the 
equilibrium  state  residence  probabilities  for  the  M/G/1  system  in 
terms  of  the  Laplace  transform  of  the  service  time  distribution, 
as  follows 


$$ Q[a] = \frac{(a-1)(1-p)\,B^*[L(1-a)]}{a - B^*[L(1-a)]} \tag{10} $$

The probabilities themselves can be recovered from the generating
function by repeated differentiation according to

$$ P'_j = \frac{\{Q[0]\}^{(j)}}{j!} \tag{11} $$

Proceeding in this manner, we obtain

$$ P'_0 = (1-p) \tag{12} $$

$$ P'_1 = (1-p) \left\{ \frac{1}{B^*[L]} - 1 \right\} \tag{13} $$

$$ P'_2 = (1-p) \left\{ \frac{1 + L \{B^*[L]\}^{(1)} - B^*[L]}{(B^*[L])^2} \right\} \tag{14} $$

The differentiations become increasingly complex, and each
successive expression involves an additional derivative of the
transform.
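
Expressions (12) to (14) can be checked numerically. The sketch below uses the forms that follow from differentiating the generating function (10): P'1 = (1-p)(1/B*[L] - 1) and P'2 = (1-p)(1 + L B*'[L] - B*[L]) / B*[L]^2. The function name is hypothetical, and the check uses exponential service, for which the P'j are known to be geometric.

```python
def first_three_probs(p, B_L, dB_L, L):
    """P'0, P'1 and P'2 of the unbounded M/G/1 system.

    p    -- load factor
    B_L  -- Laplace transform of the service time distribution at L
    dB_L -- first derivative of that transform at L
    L    -- arrival rate
    """
    P0 = 1.0 - p
    P1 = (1.0 - p) * (1.0 / B_L - 1.0)
    P2 = (1.0 - p) * (1.0 + L * dB_L - B_L) / B_L ** 2
    return P0, P1, P2

# Exponential service with mean 1: B*[s] = 1/(1+s), derivative -1/(1+s)^2.
L = 0.5
p = L * 1.0
B_L = 1.0 / (1.0 + L)
dB_L = -1.0 / (1.0 + L) ** 2
P0, P1, P2 = first_three_probs(p, B_L, dB_L, L)
```

With these inputs the three values are 0.5, 0.25 and 0.125, the geometric probabilities (1-p)p^j, which supports the reconstruction of (13) and (14) used here.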

To  complete  the  proof,  we  rely  on  intuition  regarding  the 
equilibrium  queue  length  distribution  of  an  unbounded  M/G/1 
system (i.e., the probability density function of the P'j). We
note that this function has a single maximum, and no local maxima
or minima. Thus if P'0 remains constant and P'1 decreases, for
example, the residual probability will be distributed among the
remaining P'j, and the probability that the system contains N or
fewer  customers  will  decrease. 

We note that P'1 is inversely proportional to the value of
the Laplace transform evaluated at L. For P'2, this inverse
relationship is more pronounced (because of the squared term in
the denominator), but is modulated by a term involving the first
derivative of the transform. For large j, of course, the P'j
will increase slightly with increasing Laplace transform value.

From (14) and the discussion following it, we note that the
moments of the service time distribution do, in fact, play an
explicit role. They are L-moments rather than central moments,
however, and the value of the Laplace transform evaluated at L is
the dominant term.


A QUEUEING MODEL OF AN RPS DISK SYSTEM


John  Zahorjan 


Abstract


In  this  paper  we  present  a  general  model  of  disk  storage 
systems  equipped  with  rotational  position  sensing  (RPS). 
(These systems are distinguished by sector scheduling at
the channel.) We show that the number of revolutions
spent  waiting  for  the  channel  due  to  the  RPS  feature  is 
not  geometrically  distributed,  as  was  assumed  in  most 
previous  models.  In  place  of  the  geometric  assumption 
we  give  an  approximate  formula  for  the  mean  and  variance 
of  the  number  of  revolutions  required,  and  use  this 
formula  to  develop  an  open,  single  queue  model 
representing  one  disk  module  of  the  multi-module  system. 
This  model  is  then  compared  to  simulations  and  to  a 
previously  developed  analytic  model.  Finally,  we 
develop  a  closed  queueing  network  model  using  the  single 
queue  model  as  a  component,  and  an  approximate  solution 
method  is  described  by  which  to  obtain  performance 
measurements  from  the  closed  model. 


+ 


This  paper  is  an  internal  SAM  document. 
1. Introduction


This paper presents a model of multi-module disk systems with
rotational position sensing (RPS) which can be used in a closed
queueing network model. The modeling technique employed relies
on two levels of detail, the first involving simulation of the
complex behavior of the system, and the second using data
obtained from this simulation to select parameters in a queueing
model. This technique is shown to be more accurate than those
previously proposed, at the expense of a small increase in the
cost of solution.


The  disk  storage  systems  to  be  modeled  communicate  with  the 
CPU  and  main  memory  through  a  channel/controller.  Requests  to 
individual  drives  are  served  by  that  drive  in  order  of  arrival 
(FCFS); a prior request must completely finish service before a
later  one  can  begin.  An  I/O  operation  transferring  data  to  or 
from the disk is composed of five phases of service (figure 1).
All  of  these  involve  the  disk  drive  itself,  but  only  two  require 
service  from  the  controller.  The  first  phase  of  service  is  the 
time  spent  waiting  for  the  channel  to  become  free  so  that  the 
seek  command  can  be  issued.  (The  actual  busy  time  incurred  by 
the  channel  in  issuing  the  seek  command  is  negligible,  and  is 
taken  to  be  zero.)  The  next  phase  of  service  is  the  seek. 
Following  the  seek  there  is  a  latency  period  while  the  desired 
sector  rotates  under  the  read/write  head.  It  is  at  this  point 
that  the  rotational  position  sensing  feature  becomes  important. 
Instead  of  serving  the  drives  in  FCFS  fashion,  the  channel  serves 
the  first  drive  which  is  ready  to  transfer  data  (that  is,  the 
first  drive  to  have  the  appropriate  sector  come  under  its 
read/write head). Therefore, a drive which has completed its
latency  phase  of  service  must  wait  an  integer  (possibly  zero) 
number  of  revolutions  before  it  is  served  by  the  channel.  This 
wait  time,  and  the  actual  data  transfer,  are  the  final  two  phases 
of  service.  It  is  important  to  note  that  although  the  drive  is 
busy  during  all  five  phases,  the  channel  is  busy  only  during  the 
actual  data  transfer. 


Figure 1. Phases of service: Channel Wait, Seek, Latency,
Channel Wait, Data Transfer.


Lynch  [72]  has  shown  that  in  at  least  some  systems  most 
requests  to  disk  storage  do  not  require  seeks.  The  intuition 
behind  this  is  that  successive  requests  to  a  lightly  loaded  drive 
probably  come  from  the  same  job,  and  therefore  are  likely  to 
require  data  on  the  same  cylinder.  In  these  systems,  the  time 
spent  waiting  for  the  channel  and  performing  the  data  transfer  is 
the bottleneck. The idea behind RPS is to reduce the extent of
this bottleneck by enabling the disks to sense their angular
position independently of the channel, and by minimizing the mean
time between completion of the seek and initiation of data
transfer.
(i)   that the model allows more than one request per drive
      in the channel queue simultaneously,

(ii)  that the queueing discipline at the controller does not
      reflect the fact that the RPS feature will result in a
      non-FCFS service discipline, and

(iii) that the assumption of exponential service times is not
      usually very accurate.


Figure  2  Figure  3 

One possible solution to all these problems is proposed by
Rose [76]. He suggests using a calibration factor to adjust the
mean service times of the I/O subsystem so that the model
statistics (specifically, CPU utilization) agree with empirical
observations. This technique appears to have little predictive
utility since one has no reason to expect a single calibration
factor obtained from the base system to be valid under all
reconfigurations to be modeled, and it is impossible to calibrate
a model of a hypothetical system.
A possible solution to (i) above is to model only the disk
drives or only the channel. If the channel is lightly utilized
the drives are the bottlenecks, so they are modeled; if the
channel is heavily utilized it is the bottleneck, so it is
modeled. The problems with this approach are that the structure
of the model must change depending on channel utilization, and
that neither extreme is appropriate for moderate channel
utilization.

This  paper  is  organized  as  follows.  In  section  2  we  develop 
a  model  of  the  disk  subsystem  with  Poisson  arrivals.  This  model 
is  shown  to  be  accurate  for  all  loads  by  comparisons  with 
simulations.  In  section  3  we  develop  a  computer  system  model 
consisting  of  the  I/O  subsystem  model,  a  CPU  model,  and  a  fixed 
number  of  tokens  (representing  the  multiprogramming  jobs  of  the 
computer system). Performance measurements from this model are
then  validated  against  simulations.  Finally,  in  section  4  the 
I/O  subsystem  model  is  used  to  compare  shortest-seek-time-first 
(SSTF)  scheduling  at  the  disks  with  FCFS. 


2. Open System Model of RPS Disk Storage

2.1 RPS Disk Storage Model
We model a multi-module disk system using a set of M/G/1
queues in parallel (figure 3). Each M/G/1 queue represents a
single disk module. (There is no explicit channel server.
Channel service time is included as a phase of disk service.)
An M/G/1 queue consists of a single queue and a single server.
Requests arriving at the queue are serviced in order of
arrival. The service time of each request is selected from an
arbitrary service time distribution. Both the rate of
arrivals and the service time distribution are parameters of
the model.


Classical  M/G/1  queueing  results  show  that  all  that  is  needed 
to  solve  our  model  is  the  arrival  rate  to  the  system  (which  is 
given)  and  the  mean  and  variance  of  the  service  time 
distribution.  If  we  assume  that  the  five  phases  of  service  are 
independent  of  each  other,  the  overall  mean  and  variance  of  the 
service  time  at  a  drive  can  be  computed  as  the  sum  of  the  means 
and  variances  of  the  individual  phases,  respectively.  Our  first 
task,  therefore,  is  to  describe  the  distributions  governing  the 
phases  of  service. 


Of  the  five  phases  of  service,  it  is  assumed  that  at  least 
the  first  two  moments  of  the  distributions  of  all  but  the  channel 
wait  and  extra  rotation  times  are  known  (the  remaining  phases  of 
service  are  highly  installation-dependent,  so  direct  measurement 
is necessary to determine their distributions). Therefore, if we
can obtain the mean and variance of these two unknown
distributions, we can compute the mean and coefficient of
variation of the entire service time. Given these values, M/G/1
results can be used to obtain performance measures.

To calculate the mean and variance of the first phase of
service (channel wait) we need three values: Bi, the probability
that the channel is busy when the seek is requested by disk i,
CWMi|Bi, the mean channel wait time for disk i given that the
channel is busy, and CWVi|Bi, the variance of disk i's channel
wait time given that the channel is busy. With these we have

     CWMi = Bi * (CWMi|Bi)

     CWVi = Bi^2 * (CWVi|Bi)

where CWMi and CWVi are the overall channel wait mean and
variance experienced by disk i.

The problem of determining CWMi|Bi and CWVi|Bi is one of
residual life. For our purposes it is sufficient to note that

     CWMi|Bi = T(2) / (2 T(1))

     CWVi|Bi = T(3) / (3 T(1)) - (CWMi|Bi)^2

where T(i) is the ith moment of the transfer time distribution
taken about the origin (Kleinrock [75]).
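
As a minimal sketch of how these channel wait formulas could be evaluated, the Python fragment below computes the conditional residual-life mean and variance from the first three moments of the transfer time and then applies the weighting by Bi exactly as printed above. The function name, its arguments, and the example of a constant 8.35 ms transfer time are illustrative assumptions; the default Bi of 0.1 is the constant value adopted in the text.

```python
def channel_wait_moments(t1, t2, t3, b=0.1):
    """Mean and variance of the initial channel wait for one disk.

    t1, t2, t3: first three moments about the origin of the transfer time.
    b: probability that the channel is busy when the seek is requested (Bi).
    The function name and signature are hypothetical, not from the paper.
    """
    mean_given_busy = t2 / (2.0 * t1)                        # CWMi|Bi
    var_given_busy = t3 / (3.0 * t1) - mean_given_busy ** 2  # CWVi|Bi
    # Unconditional values, weighted by Bi as the formulas are printed.
    return b * mean_given_busy, b ** 2 * var_given_busy


# Assumed example: a constant transfer time of 8.35 ms, so that the
# k-th moment about the origin is simply 8.35 ** k.
t = 8.35
cwm, cwv = channel_wait_moments(t, t ** 2, t ** 3, b=0.1)
print(cwm, cwv)
```

For a constant transfer time the conditional mean reduces to half the transfer time, which is the familiar residual-life result for a deterministic interval.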

This  leaves  us  with  the  problem  of  determining  Bi.  The 
traditional  method  of  estimating  Bi  is  simply  to  use  the  channel 
utilization  (Abate  and  Dubner  [69],  Wilhelm  [77]).  Simulations, 
however, show that Bi nearly always lies in the range .07-.20,
with the smallest values occurring for high drive (and hence high
channel)  utilizations.  This  is  easily  understood  by  noting  that 
at  high  utilizations  queues  tend  to  build  up  at  the  disks. 
Therefore,  most  requests  begin  service  immediately  after  a 
previous  request  has  completed  its  data  transfer.  Since  the 
channel initiates all pending seeks before beginning a new data
transfer,  most  seek  requests  experience  no  delay.  This,  in 
conjunction  with  the  fact  that  this  phase  of  service  is  typically 
only  .1%  of  the  total  service  time,  led  us  to  assume  a  constant 
Bi  of  .1.  The  small  error  in  Bi  resulting  from  this  assumption 
has  a  negligible  effect  on  predicted  performance  measurements. 

With the problem of the initial channel wait time
distribution out of the way, all that remains is to find the mean
and variance of the extra rotation probability density function
(ERP). Abate and Dubner [69], Fuller and Baskett [75], Omahen
[75], and Wilhelm [77] claim that the ERP should be geometric
with the parameter being the probability that the controller is
busy when a data transfer is requested. A geometric distribution
would mean that the probability of a drive finding the controller
busy would be independent of how many unsuccessful tries for the
controller it had made previously. Simulations show that the ERP
is definitely not geometric, and therefore that the independence
assumption is incorrect. Table 1 lists two typical sequences of
the probability that the controller is busy conditioned on the
number of previous tries. If the distribution were geometric the
probabilities in each sequence would be constant.


Previous
 Tries      6 Drives    8 Drives

   0         .2938       .3029
   1         .3441       .3665
   2         .3620       .4243
   3         .3661       .4586

Table 1
[...] order in which the drives must be serviced, the great
majority of the drives will have been serviced in any
sequential group of m data transfers. The increase in channel
busy probability after the first attempt can therefore be
attributed to the fact that if the channel is busy on the first
attempt it means that it is likely that the drive making the
request has requested a data transfer out of turn. The channel
busy probability increases because the effect of getting out of
synchronization outweighs the relatively minor decrease in
channel contention caused by one drive out of many completing its
data transfer.

Unfortunately, the fact that the probability of channel busy
in a given attempt is so highly dependent on what has previously
occurred makes analysis extremely difficult. However, the
probability of finding the controller busy on the first request,
since these attempts occur at nearly random intervals, is
predictable.


To compute the probability that the channel is found to be
busy the first time it is requested, we consider a single drive,
say drive i. If PDISKi is the stationary selection probability
of disk i for each request entering the system, the channel will
be busy (1 - PDISKi) * UCH of the time with drives other than
drive i, where UCH is the channel utilization. However, since
only one drive may use the channel at a time, this utilization is
confined to times when drive i is not using the channel. The
fraction of time that drive i is not using the channel is
1 - UCH * PDISKi. Since drive i will request the channel only
when it is not already using it, the probability that it finds
the channel busy is the channel utilization attributable to the
other drives divided by the fraction of the time drive i is not
using the channel. Therefore, P0i(UCH), the probability that the
channel is busy after 0 extra revolutions by disk i in a system
with mean channel utilization UCH, is given by

(1)  P0i(UCH) = (1 - PDISKi) * UCH / (1 - PDISKi * UCH)


In the open system the channel utilization is simply the
arrival rate to the system multiplied by the channel service
time. Since PDISKi is known, it is a simple matter to compute
P0i for any given arrival rate to the open system.


Table 2 compares predicted values of P0i(UCH) with those from
simulations for a 6 drive system. All predictions come within
3% of the observed probabilities. The prediction gives good
results for all reasonable values of the first two moments of the
seek and transfer time distributions.
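
The first-attempt busy probability of equation (1) is simple enough to evaluate directly. The sketch below is one way to do so; the function names and the example workload (six equally loaded drives, one request every 15 ms, an 8.35 ms mean transfer) are assumptions made for illustration and are not taken from the simulations reported here.

```python
def channel_utilization(arrival_rate, mean_transfer_time):
    """Open-system channel utilization: arrival rate times channel service time."""
    return arrival_rate * mean_transfer_time


def p0_busy(pdisk, uch):
    """Equation (1): probability that disk i finds the channel busy
    on its first attempt, given its selection probability and UCH."""
    return (1.0 - pdisk) * uch / (1.0 - pdisk * uch)


# Assumed example: 6 equally loaded drives, one arrival every 15 ms,
# mean data transfer time of 8.35 ms.
uch = channel_utilization(1.0 / 15.0, 8.35)
print(uch, p0_busy(1.0 / 6.0, uch))
```

Two limiting cases give a quick sanity check: a drive that receives none of the load sees the channel busy with probability UCH, and a drive that receives all of the load never finds the channel busy with another drive.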


As mentioned previously, the probability of finding the
channel busy on the second and successive tries is too highly
dependent on previous events to allow a reasonable analytic
prediction. Simulation runs, however, show that the Pni, the
probability that disk i finds the channel busy on the nth attempt
after n-1 unsuccessful attempts, n = 1,2,..., have the same form
as P0i as functions of UCH. A function of the form

(2)  (1 - a*PDISKi) * UCH / (1 - b*PDISKi*UCH)

can be fitted to the Pni, where parameters a and b will depend
only on the number of drives. This is done by fitting to one
simulation of Pni for high utilization and another for low
utilization. Table 3 contains a list of observed and predicted
P1i for the 6 drive system. As the table shows, a function of
this form fits the empirical curve very well.

Observed    Predicted

 .5738       .5783
 .5153       .5147
 .3694       .3725
 .2938       .2957

Table 2


Observed    Predicted

 .6645       .6665
 .5898       .5947
 .4433       .4328
 .3441       .3445

Table 3
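
Because the form (2) has only two free parameters, two observed (UCH, Pni) points determine a and b exactly. The sketch below illustrates one way this fit could be carried out: rearranging (2) gives a relation that is linear in a and b, which is then solved by Cramer's rule. The function names and the synthetic data points are assumptions for illustration only; the paper does not state which numerical procedure was used.

```python
def busy_prob(pdisk, uch, a, b):
    """The functional form (2) for the channel busy probability."""
    return (1.0 - a * pdisk) * uch / (1.0 - b * pdisk * uch)


def fit_ab(pdisk, point_low, point_high):
    """Solve for a and b from two observed (uch, p) points.

    Rearranging p * (1 - b*pdisk*uch) = (1 - a*pdisk) * uch gives
        (pdisk*uch) * a - (pdisk*uch*p) * b = uch - p,
    which is linear in a and b.
    """
    (u1, p1), (u2, p2) = point_low, point_high
    c1, d1, r1 = pdisk * u1, -pdisk * u1 * p1, u1 - p1
    c2, d2, r2 = pdisk * u2, -pdisk * u2 * p2, u2 - p2
    det = c1 * d2 - d1 * c2
    a = (r1 * d2 - d1 * r2) / det
    b = (c1 * r2 - r1 * c2) / det
    return a, b


# Synthetic check (hypothetical values): generate two points from known
# parameters and confirm that the fit recovers them.
pdisk = 1.0 / 6.0
low = (0.3, busy_prob(pdisk, 0.3, 0.3, 0.7))
high = (0.8, busy_prob(pdisk, 0.8, 0.3, 0.7))
a_fit, b_fit = fit_ab(pdisk, low, high)
print(a_fit, b_fit)
```

Setting a = b = 1 in `busy_prob` reproduces equation (1), which is why the same routine can serve for every attempt number.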


It is possible to make accurate predictions of all the Pni
using functions of the form of (2). However, it is unnecessary
to fit curves to more than the first few terms. The number of
terms actually necessary for a given accuracy is dependent on the
channel utilization. If channel contention is low enough that
most requests are serviced in the first few attempts, the first
one or two terms will do. If channel contention is such that many
requests are delayed for more than a few revolutions, more terms
will be needed. We chose to predict only the first three terms,
since the vast majority of requests are served in the first three
tries. All channel busy probabilities beyond the third term are
taken to be equal to P2i(UCH). This decision is based on the
observation (from simulation) that in systems with a large number
of highly utilized disks, the channel busy probabilities after
the second term are nearly constant. For fewer disks or lower
utilizations most requests are serviced in the first few
rotations, so the small inaccuracy incurred by assuming constant
busy probability after the third attempt is insignificant.

Once the channel busy probabilities are obtained, the
probability P_SUCCESSni of a request from disk i being serviced
after n unsuccessful attempts at data transfer can be computed as

     P_SUCCESSni = (1 - Pni(UCH)) * prod_{j=0}^{n-1} Pji(UCH)

Assuming constant probability after the third term, the mean of
the number of extra rotations R is

(3)  Ri = P0i(1 - P1i) + P0i*P1i(2 - P2i) / (1 - P2i)

and the variance V2 is

(4)  V2i = P0i(1 - P1i) + (P0i*P1i(4 - 3P2i + (P2i)^2) /
           (1 - P2i)^2) - (Ri)^2

Similar  formulas  can  be  developed  for  distributions  obtained  by 
predicting  more  than  three  terms.  This  might  be  done  to  obtain 
greater  accuracy  for  systems  where  a  large  percentage  of  the 
requests  were  not  serviced  in  the  first  three  attempts. 
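
The closed forms for the mean and variance of the number of extra rotations can be checked numerically against the distribution they summarize. The sketch below does this under the stated assumption that every busy probability beyond the third term equals P2i; it evaluates the closed forms and compares them with a direct summation over the success probabilities. The function names are hypothetical, and the sample inputs are the 6 drive values listed in Table 1.

```python
def extra_rotation_moments(p0, p1, p2):
    """Closed-form mean and variance of the number of extra rotations,
    assuming a constant busy probability p2 from the third attempt on."""
    mean = p0 * (1 - p1) + p0 * p1 * (2 - p2) / (1 - p2)
    second = p0 * (1 - p1) + p0 * p1 * (4 - 3 * p2 + p2 ** 2) / (1 - p2) ** 2
    return mean, second - mean ** 2


def extra_rotation_moments_by_sum(p0, p1, p2, terms=2000):
    """Same quantities by summing n and n*n times the probability of
    being served after exactly n unsuccessful attempts."""
    busy = [p0, p1]
    mean = second = 0.0
    reach = 1.0  # probability that the first n attempts all found the channel busy
    for n in range(terms):
        pn = busy[n] if n < 2 else p2
        success = reach * (1 - pn)
        mean += n * success
        second += n * n * success
        reach *= pn
    return mean, second - mean ** 2


closed = extra_rotation_moments(0.2938, 0.3441, 0.3620)
summed = extra_rotation_moments_by_sum(0.2938, 0.3441, 0.3620)
print(closed, summed)
```

Agreement between the two results is a useful guard when extending the closed forms to more than three predicted terms.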

Assuming independence of the five phases of service, the mean
and variance of the total service time is the sum of the means
and variances of the phases, respectively. With these we can
calculate the squared coefficient of variation for disk i
(variance / mean^2), CVi^2, and use the Pollaczek-Khinchin mean
value formula to obtain performance predictions (Kleinrock [75]).
The measures to be used to compare models are average time in
service, Xi, average time in system (residence time), Ti, average
drive queue length, QDRIVEi, and average queue length of jobs
waiting for the channel to perform a data transfer, QCHANNEL (the
subscript i in each case denotes the ith server). The formula
for each of these is:


(5)  Xi = sum_j (mean service time of phase j),
          j = channel wait, seek, latency, extra rotation,
              data transfer

(6)  Ti = Xi(1 + UDRIVEi(1 + CVi^2)/(2(1 - UDRIVEi)))

(7)  QDRIVEi = UDRIVEi + UDRIVEi^2 (1 + CVi^2)/(2(1 - UDRIVEi))

(8)  QCHANNEL = (R * Ti [...] mean data transfer time) *
                UDRIVEi * m / Xi
where UDRIVEi is the drive utilization. In the open system
UDRIVEi is given by

     UDRIVEi = L * PDISKi * Xi

where L is the arrival rate to the entire system.
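
To make the use of equations (5) through (7) concrete, the following sketch sums per-phase means and variances (under the independence assumption) and then applies the Pollaczek-Khinchin mean value formulas. All names are hypothetical, and the `arrival_rate` argument stands for the per-drive rate L * PDISKi; the numeric example is invented for illustration.

```python
def service_moments(phases):
    """Total mean and squared coefficient of variation of service time.

    phases: iterable of (mean, variance) pairs, one per phase of service,
    assumed independent so that means and variances simply add.
    """
    phases = list(phases)
    mean = sum(m for m, _ in phases)
    variance = sum(v for _, v in phases)
    return mean, variance / mean ** 2


def mg1_measures(x, cv2, arrival_rate):
    """Drive utilization, time in system (6) and queue length (7)
    for an M/G/1 queue with mean service time x and squared CV cv2."""
    u = arrival_rate * x
    if not 0.0 <= u < 1.0:
        raise ValueError("utilization must lie in [0, 1)")
    t = x * (1.0 + u * (1.0 + cv2) / (2.0 * (1.0 - u)))
    q = u + u ** 2 * (1.0 + cv2) / (2.0 * (1.0 - u))
    return u, t, q


# Invented example: two phases with (mean, variance) in arbitrary time units.
x, cv2 = service_moments([(1.0, 0.5), (2.0, 0.5)])
u, t, q = mg1_measures(x, cv2, arrival_rate=0.2)
print(u, t, q)
```

By Little's law the queue length returned here equals the arrival rate times the time in system, which offers a simple consistency check on any implementation.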

Table 4 gives a list of parameters a and b of equation (2)
for 4, 6, and 8 drive systems. These parameters were obtained by
interpolating through channel busy probabilities observed under
different loads in simulations of systems with equal loading on
all disks. The performance of an m module disk system under the
assumptions of section 2.1 can be examined for various loads by
substituting these parameters into equations (3) and (4). (For
systems not meeting the assumptions, simulations of the system
under two different loads can be run to obtain the parameters, or
the recommendations given in section 2.3 of how to adjust for
changes in the assumptions can be followed.) Given these values,
equations (6)-(8) can be used to obtain performance measurements
of the modeled system.

For P2i
          4        6        8

a       0.835    -.10?    -3.21
b       1.172    0.9??    -3.73

Table 4


2.2 Comparisons with Other Models

In this section we compare the model of section 2.1 to
Omahen's model (Omahen [75]). Simulation is used as the method
of validation. Predictions of drive average queue length, drive
utilization, average service time, average time in system, and
channel average queue length will be used as the basis of
comparison.

Table 5 lists the predictions of the three models for 4, 6, and
8 drives under a number of loads. Our model stays within [...] of
the simulations in nearly every case. The greatest errors occur
in the time in system and queue length statistics under heavy
loading conditions. This is because the assumption that the five
phases of service are independent leads to a coefficient of
variation which differs from the actual coefficient of variation.
The effect of the error in the CV becomes more significant as
the load increases, as can be seen in the table.

Omahen's  model  in  general  does  not  perform  as  well.  His 
assumption  that  the  extra  rotation  and  data  transfer  phase  of 
service  is  exponentially  distributed  introduces  considerable 
error  into  both  the  mean  and  variance  of  the  ERP.  These 
inaccuracies affect the mean and CV of the entire service time,


For P1i
          4        6        8

a       1.400    0.106    -.568
b       2.661    0.804    0.403
MODEL   IAT    UDRV   ST       TIS     QDRV    QCH

4 drive systems

S       1700   .817   5539.1   19666   2.906   1.265
Z              .818   5568.8   19966   2.936   1.261
O              .791   5378.8   18468           1.691

S       2500   .488   4862.5    7383   0.741   0.591
Z              .489   4895.1    7414   0.741   0.588
O              .487   4871.2    7732           0.946

S       3000   .396   4707.0    6462   0.543   0.446
Z              .395   4742.6    6382   0.531   0.439
O              .394   4729.7    6610           0.741

6 drive systems

S       1350   .905   7278.2   58517   7.323   2.920
Z              .907   7350.1   56526   6.978   2.908
O              .792   6418.1   23221           2.899

S       1500   .720   6440.4   17462   1.954   2.045
Z              .717   6455.9   16816   1.868   2.021
O              .664   5980.3   23221           2.899

S       2500   .335   4998.0    6214   0.432   0.638
Z              .334   5014.8    6390   0.426   0.636
O              .333   4996.5    6555           0.997

8 drive systems

S       1500   .550   6637.4   12343   1.023   2.135
Z              .545   6541.3   11486   0.957   2.078
O              .514   6171.2   10593           2.444

S       2000   .351   5557.3    7453   0.471   1.075
Z              .342   5478.2    7121   0.445   1.027
O              .337   5405.2    7182           1.450

S       2500   .254   5084.5    6084   0.304   0.667
Z              .253   5079.2    6034   0.301   0.662
O              .253   5067.8    6148           1.025

Key:

IAT  - Inter-arrival time        TIS  - Time in system
UDRV - Drive utilization         QDRV - Avg. drive queue length
ST   - Avg. drive service time   QCH  - Avg. channel queue length


Table 5
so that model statistics vary from the simulation by more than
40% in some cases. It should also be noted that Omahen's model
does not lend itself easily to use in a closed queueing network
model.


2.3 Altering the Assumptions


Because  the  computation  of  the  queueing  model's  parameters 
relies  on  numbers  obtained  from  simulations  with  specific  seek 
and  transfer  time  distributions,  it  is  not  clear  that  these 
numbers  are  valid  for  models  of  systems  with  different  seek  or 
transfer  time  distributions.  The  same  difficulty  of  analysis 
that required the use of simulation to set these parameters
originally  prohibits  an  a  priori  analysis  of  the  precise  effects 
of  the  possible  changes  to  the  service  time  distributions.  With 
this  in  mind,  a  few  guidelines  are  presented  that  could  be  useful 
in adjusting the model to accommodate a new situation. However,
the  best  method  of  doing  this  is  to  run  the  two  simulations  that 
are  required  to  tailor  the  model  to  the  specific  system. 


The first change considered is altering the transfer time
distribution. The effect of this change was examined by
comparing the results of simulations of systems with different
transfer time distributions with the predictions for these
systems obtained from the model of section 2. Simulations were
run with transfer time means of 4.17 and 8.35 ms and either a
constant transfer time distribution or an exponential
distribution (the parameters of section 2.1 were obtained from
systems with a constant 8.35 ms data transfer time). The
difference between the simulations and the model never exceeded
4%. This leads to the conclusion that the parameter settings are
insensitive to the data transfer distribution. This conclusion
is supported by the fact that the data transfer distribution
intuitively appears to have very little to do with the [...]
factor determining the extra rotation distribution. The data
transmission distribution does not seem to significantly affect
the likelihood that a disk module will get out of synchronization
with the other modules in performing data transfers. So a
change in the transfer time distribution should have little
effect on the ERP.
Adjusting for changes to the seek time distribution is
equally simple. Simulations of the disk system under the
condition that the seek arm was twice as fast were compared with
predictions of the queueing model. The error in these
predictions was less than 6%. This is a strong indication that
the a's and b's derived from the initial simulations can be used
for altered seek time distributions without introducing
significant error.


To model systems with more than one controller we need to
find the probability that at least one of the controllers is free
when a data transfer is attempted. If we compute the Pni as
before and then divide by the number of controllers, we will have
an approximation to the probability of a single controller being
busy. Assuming that the usage of a controller is independent of
the others, the probability that all controllers will be busy
after n unsuccessful attempts at data transfer is

     (Pni/c)^c

where c is the number of controllers. Simulations of 6 and 8
drive systems have shown this approximation to be very good.
This  is  reasonable  in  light  of  the  fact  that  the  probability  of 
finding  no  controller  free  when  there  are  two  or  more  of  them  is 
so  small  that  even  for  a  large  number  of  heavily  used  disks  more 
than  90%  of  the  requests  are  serviced  on  the  first  attempt.  The 
approximation  therefore  can  sustain  fairly  gross  inaccuracies  in 
higher  terms  without  much  effect  on  the  overall  accuracy. 
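
The multi-controller approximation amounts to a one-line computation. A minimal sketch, with a hypothetical function name and invented sample values, is:

```python
def all_controllers_busy(pn, c):
    """Approximate probability that all c controllers are busy, given the
    single-controller busy probability pn computed as in the one-controller
    model. Assumes the controllers are used independently of one another."""
    if c < 1:
        raise ValueError("need at least one controller")
    return (pn / c) ** c


# Invented example: a busy probability of 0.4 with one and with two controllers.
print(all_controllers_busy(0.4, 1), all_controllers_busy(0.4, 2))
```

With a single controller the expression reduces to Pni itself, so the same routine covers the base model.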

The  final  change  to  the  assumptions  to  be  considered  is  that 
of  changing  the  distribution  of  the  load  among  the  disk  modules. 
Assuming  that  the  parameters  a  and  b  remain  valid  for  the  new 
system, we can find the ERP of a specific disk module which
processes PDISKi of the load by computing P0i through P2i
according to (2).

Simulations show that the assumption that parameters a and b
are insensitive to the distribution of the load is valid.
Predictions of our model are within 4% of the results of
simulations for all performance measures when the load on the
disks is redistributed. If better approximations were required,
two  simulations  with  different  arrival  rates  could  be  run  to 
obtain  parameters  a  and  b  specific  to  the  actual  distribution  of 
the  load  among  the  disks. 


3. Closed System Model with RPS Disk Storage Subsystem


3.1 Model and Solution

The  incorporation  of  the  open  system  model  of  section  2  into 
closed  queueing  network  models  is  examined  by  dealing  with  a 
particular  type  of  closed  model,  the  central  server  model  (Buzen 
[71])  (figure  4).  As  the  name  implies,  in  this  type  of  model 
there  is  a  single  central  server  from  which  a  token  must  receive 
service  before  proceeding  to  one  of  the  other  servers.  The 
central  server  represents  the  computer  system's  CPU  and  the  other 
service  centers  represent  the  various  I/O  devices.  The  tokens  in 
the  model  represent  the  jobs  of  the  computer  system,  so  that  the 
number  of  tokens  in  the  model  is  taken  to  be  the  maximum 
multiprogramming  level  of  the  system.  The  assumption  behind  the 
fixed  number  of  tokens  is  that  there  is  always  a  backlog  of  jobs, 
and  a  job  which  acquires  all  its  required  service  is  immediately 
replaced  by  a  (statistically  identical)  job. 


For simplicity, the particular central server model used
consists of only a central server, representing the computer
system's CPU, and the I/O subsystem model described in the
previous section. However, the discussion of the solution of
this model can be applied to any closed queueing network
containing the RPS disk storage subsystem model as a component.

In order to apply the results of section 2, the channel
utilization must be known. Since we are dealing with a closed
system, there is no way to determine the channel utilization
except to solve the system. We therefore require a knowledge of
the channel utilization to determine the model's parameters so
that it may be solved, and we need to solve the model in order to
discover the channel utilization.

Because of this circularity, we use an iterative approach.
We first guess a channel utilization and solve the model with
parameters formed from this estimate of channel utilization. A
new guess is obtained from the solution, and the model is solved
once again, with parameter settings based on the new guess. At
each step, the average of the utilization provided as input and
the utilization predicted by the model is used as input to the
next step of the iteration. This process can be represented as
follows:

     UCH(0) = initial guess

     UCH(i+1) = (UCH(i) + f(UCH(i))) / 2

where the function f represents the value of channel utilization
obtained by solving the model with parameters based on a channel
utilization of UCH(i).

The  average  is  taken  because  if  the  initial  guess  is  too  low 
the  model’s  average  drive  service  time  will  also  be  too  low  due 
to  the  underestimate  of  channel  contention.  Therefore,  the  model 
will  show  a  greater  throughput  at  the  drives  than  would  occur  in 
the  actual  system,  and  so  the  predicted  channel  utilization  will 
be too high. Similarly, if the guess is too high, the predicted
channel throughput will be too low and so the predicted channel
utilization will be too low. The actual value of channel
utilization therefore lies between the guess and the prediction,
and so the iteration using the average of the guess and the
prediction will converge. The iteration is terminated when the
difference between the guessed and predicted utilizations becomes
small enough.
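
The averaging iteration can be sketched as follows. This is a minimal illustration under stated assumptions, not the program used for the examples in this paper: `solve_model` is a hypothetical callable standing in for the function f (a full solution of the closed model for an assumed channel utilization), and the toy model, the tolerance, and the iteration cap are chosen only so that the sketch runs on its own.

```python
def iterate_channel_utilization(solve_model, initial_guess, tol=1e-6, max_iter=100):
    """Averaged iteration UCH(i+1) = (UCH(i) + f(UCH(i))) / 2.

    solve_model is a hypothetical stand-in for f: it takes an assumed
    channel utilization and returns the utilization the model predicts.
    """
    uch = initial_guess
    for _ in range(max_iter):
        predicted = solve_model(uch)
        # Stop when the guess and the model's prediction agree closely enough.
        if abs(predicted - uch) < tol:
            return uch
        # Otherwise the average of guess and prediction is the next input.
        uch = (uch + predicted) / 2.0
    raise RuntimeError("channel utilization iteration did not converge")


def toy_model(u):
    """Toy stand-in for f (an assumption, not the disk model).

    It is decreasing in u, so a low guess yields a high prediction and
    a high guess yields a low prediction, as argued in the text.
    """
    return 0.6 - 0.5 * u


uch_star = iterate_channel_utilization(toy_model, initial_guess=0.1)
```

For this toy f the fixed point is 0.4, and each averaged step reduces the error by a factor of four.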


3.2  An  Efficient  Solution  Technique  for  the  Closed  Model 

Since  we  must  solve  the  model  repeatedly  with  different 
guesses  for  channel  utilization,  an  efficient  solution  technique 
is  required.  The  one  employed  for  examples  in  this  paper  is 
Zahorjan's approximate solution technique (Zahorjan [77], also
described briefly in Sevcik et al. [77]). This is a modification
of  Sauer’s  approximation  method  (Sauer  and  Chandy  [75])  and 
involves  a  reduction  of  the  m  I/O  servers  into  a  single  load 
dependent  server.  The  service  rate  of  this  server  when  n 
customers  are  present  is  the  throughput  rate  of  the  I/O  submodel 
when  the  CPU  service  time  is  set  to  zero  and  the  system  is  solved 
with n customers. Sauer's technique obtains these throughputs by
assuming  exponential  service  time  distributions  everywhere  and 
solving  the  resulting  model  using  relatively  efficient  local 
balance  methods  (Baskett  et  al.  [75]).  However,  the  assumption 
of  exponential  holding  times  has  been  shown  to  introduce 
unacceptable  error  (Zahorjan  [76],  Sevcik  et  al.  [77]). 
Therefore,  a  global  balance  solution  method  is  used  so  that 
general  service  time  distributions  may  be  modeled.  The  problem 
with  solving  the  I/O  subsystem  model  using  global  balance  is  that 
the  amount  of  computation  required  grows  combinatorially  with  the 
number  of  service  centers  and  number  of  tokens  in  the  system,  and 
so can be unmanageable even for apparently simple models (Levy
[77]). Our technique therefore replaces a single solution of an
m  server  model  by  m-1  solutions  of  2  server  models.  The  two 
server  models  solved  consist  of  two  I/O  servers  in  parallel.  A 
single  load  dependent  server  is  formed  using  the  rates  computed 
from  the  two  server  model  and  replaces  the  two  I/O  servers  in  the 
I/O  subsystem  model.  The  process  of  replacing  two  servers  by  a 
single  load  dependent  server  continues  until  there  is  only  one 
load  dependent  server  left.  This  server  approximates  the 
behavior  of  the  entire  I/O  subsystem.  Finally,  a  model 
consisting  of  the  CPU  server  and  the  single  I/O  subsystem  server 
is  solved  to  obtain  performance  measurements  of  the  entire 
system. 
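
The pairwise reduction can be sketched in a few lines. The technique described above solves each two server model by global balance so that general service time distributions can be represented; to keep the sketch short, the code below instead ASSUMES exponential service, for which each two server model has a product form solution. The function names and the example rates are hypothetical.

```python
def pair_throughputs(rates_a, rates_b, prob_a, prob_b, n_max):
    """Throughput X(n), n = 1..n_max, of two parallel servers in isolation.

    rates_a[k-1] and rates_b[k-1] are the service rates with k customers
    present; prob_a and prob_b are the branching probabilities to each
    server. Exponential service is assumed (product form).
    """
    total = prob_a + prob_b
    qa, qb = prob_a / total, prob_b / total  # routing within the pair

    def weights(q, rates):
        # f(k) = product over i = 1..k of q / rate(i)
        f = [1.0]
        for k in range(1, n_max + 1):
            f.append(f[-1] * q / rates[k - 1])
        return f

    fa, fb = weights(qa, rates_a), weights(qb, rates_b)
    # Normalizing constant G(n): sum over the ways n customers can split.
    g = [sum(fa[k] * fb[n - k] for k in range(n + 1)) for n in range(n_max + 1)]
    return [g[n - 1] / g[n] for n in range(1, n_max + 1)]


def collapse_io_subsystem(servers, n_max):
    """Replace m servers by one load dependent server using m-1 pair solutions.

    servers is a list of (branching probability, [rate with 1..n_max customers]).
    The result is the composite service rate for n = 1..n_max customers.
    """
    prob, rates = servers[0]
    for next_prob, next_rates in servers[1:]:
        rates = pair_throughputs(rates, next_rates, prob, next_prob, n_max)
        prob += next_prob
    return rates


# Hypothetical example: two identical disks, 40 requests/sec each.
two_disks = [(0.5, [40.0] * 3), (0.5, [40.0] * 3)]
composite = collapse_io_subsystem(two_disks, n_max=3)
```

With one customer present only one disk can be busy, so the composite rate equals the rate of a single disk; the rate then rises toward the combined rate of both disks as customers are added.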

Since  the  final  system  has  a  composite  service  center  in 
place  of  the  service  centers  for  the  individual  drives, 
statistics  obtained  from  the  solution  of  the  model  pertain  to  the 
composite  service  center,  not  to  the  individual  I/O  service 
centers.  However,  mean  performance  measurements  for  the  disk 
servers  can  be  extracted  from  those  of  the  composite  server. 

The channel utilization, UCH, can be computed from the
throughput of the composite I/O server by

UCH = (I/O throughput) * D

where D is the mean data transfer time of the disks. Similarly,
the utilization of the ith drive, UDRIVEi, can be found by

UDRIVEi = PDISKi * (I/O throughput) * Xi

where Xi, the average service time at disk i, is given by (5).
The average number of jobs waiting to perform a data transfer can
then be found as

            m
QCHANNEL = SUM (Ri * rotation time + D) * UDRIVEi / Xi
           i=1

Deriving  drive  queue  lengths  is  more  difficult.  In  the  case 
of  m  identical  drives  with  equal  branching  probabilities,  the 
average  drive  queue  length,  QDRIVEi,  is  given  by 

QDRIVE  =  (composite  service  center  avg.  queue  length)  /  m 

If  the  system  does  not  have  equal  branching  probabilities  to  the 
m  disks,  or  the  m  disks  are  not  identical,  statistics  on  the  I/O 
subsystem  can  be  obtained  by  solving  an  open  system  with  m 
parallel  service  centers  representing  the  m  disk  drives.  The 
arrival  rate  to  the  open  system  should  be  set  equal  to  the 
throughput  of  the  composite  service  center  in  the  closed  system. 
This  method  introduces  some  inaccuracy  because  arrivals  in  the 
open  system  will  be  Poisson,  and  in  the  closed  system  they  will 
not.  However,  it  provides  an  easy  means  of  getting  reasonable 
approximations  to  the  desired  measurements. 
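
As an illustration, the formulas for UCH, UDRIVEi and QCHANNEL can be evaluated directly once the throughput of the composite server is known. The sketch below is only an arithmetic aid: the function and argument names are hypothetical, the values of Ri and Xi are taken as given (they come from section 2), and the numbers in the usage lines are invented for the example.

```python
def io_measures(io_throughput, transfer_time, rotation_time,
                branch_probs, service_times, r_values):
    """Evaluate UCH, UDRIVEi and QCHANNEL from the composite server throughput.

    All times must be in the same unit, and io_throughput in requests per
    that unit. branch_probs holds PDISKi, service_times holds Xi, and
    r_values holds Ri, one entry per drive.
    """
    uch = io_throughput * transfer_time
    udrive = [p * io_throughput * x for p, x in zip(branch_probs, service_times)]
    qchannel = sum(
        (r * rotation_time + transfer_time) * u / x
        for r, u, x in zip(r_values, udrive, service_times)
    )
    return uch, udrive, qchannel


# Invented example: two identical drives, times in ms.
uch, udrive, qchannel = io_measures(
    io_throughput=0.02,      # requests per ms
    transfer_time=4.18,      # D
    rotation_time=16.7,
    branch_probs=[0.5, 0.5],
    service_times=[40.0, 40.0],
    r_values=[0.5, 0.5],
)
```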


3.3  Comparisons with Simulation Models

The  computer  system  model  of  figure  4  was  used  as  a  basis  of 
comparison  between  the  closed  queueing  model  and  simulation 
models  of  I/O  subsystems.  This  model  was  chosen  as  it  is  a 
simple  closed  central  server  model,  and  therefore  the  possible 
causes of error in the analytic model could be most easily
identified.

Predictions  of  the  analytic  model  were  compared  to 
simulations  with  a  number  of  settings  for  the  number  of  drives 
and  the  number  of  tokens  in  the  system.  The  predictions  of  the 
analytic  and  simulation  models  for  drive  and  CPU  utilizations  and 
average  queue  lengths  are  listed  in  table  6. 

4.  Comparisons of FCFS and SSTF Disk Scheduling

Although all of the examples presented so far have used
statistics about the IBM 3330 disk system under FCFS scheduling
for the parameter settings of the models, it is possible to adapt
these models to other situations. One possible change is to
model a disk scheduling policy other than FCFS. In this section
we examine the performance improvements obtainable by employing a
non-FCFS scheduling discipline (in particular,
shortest-seek-time-first (SSTF)) at the disk modules. This is
done by adapting the open model of section 2 to the new
scheduling discipline and comparing the predictions of the model
with those for FCFS queueing.


Model  Drives  MPL   UDRIVE   UCPU   QDRIVE   QCPU

  s      4      4      --      --      --     .345
  z                   .593    .233    .923    .308

  s      4      6     .718    .278   1.393    .426
  z                   .697    .263   1.403    .390

  s      4      8     .782    .294   1.885    .457
  z                   .762    .280   1.889    .445

  s      6      4     .458    .269    .602    .387
  z                   .454    .265    .611    .336

  s      6      6     .583    .305    .919    .481
  z                   .567    .292    .889    .446

  s      6      8     .665    .326   1.245    .530
  z                   .642    .314   1.245    .531

(-- marks an entry missing from the source copy.)

Table 6
If  we  ignore  the  possible  effects  of  channel  contention,  the 
major  component  of  a  disk  access  is  the  seek  time.  FCFS 
scheduling  does  not  use  information  about  the  seek  time  required 
to  service  a  request,  and  therefore  will  bypass  requests  near  the 
current  position  of  the  r/w  head  to  service  a  request  near  the 
edge  of  the  disk  simply  because  the  further  request  has  been 
waiting  the  longest.  This  results  in  greater  expected  service 
and  waiting  times  than  could  be  obtained  by  some  other  policies, 
but  it  minimizes  the  variance  of  the  wait  time  (and  thus  is 
equally "fair" to requests for all cylinders).

The  relatively  long  service  time  of  FCFS  means  that  the  I/O 
subsystem  will  saturate  at  loads  that  could  be  handled  by  other 
policies.  Denning  [67]  proposed  shortest-seek-time-first  (SSTF) 
scheduling  to  increase  the  throughput  capabilities  of  a  disk 
system.  Under  SSTF  the  request  that  is  selected  for  service  when 
the  disk  module  becomes  free  is  the  one  which  requires  the 
minimum  movement  of  the  r/w  head.  This  reduces  the  expected  seek 
time,  and  thus  the  expected  service  time,  but  results  in  a  high 
waiting  time  variance.  This  high  variance  comes  about  because, 
under  heavy  loads  (which  is  when  SSTF  should  be  most  useful),  the 
head will tend to stay in the center of the disk. Only very
infrequently will there be a series of requests that meets the
shortest seek time criterion and leads to the extremities of the
disk, and so the head stays over the central cylinders. For this
reason,
requests  to  the  middle  cylinders  will  receive  prompt  service, 
while  requests  for  the  outermost  cylinders  will  be  discriminated 
against  and  will  have  to  wait  an  unfairly  long  time. 

To  adapt  the  model  of  section  2  to  a  non-FCFS  queueing 
discipline  we  need  only  adjust  the  seek  time  mean  and  variance  to 
reflect  those  of  the  seek  time  distribution  experienced  under  the 
new  queueing  discipline.  As  queueing  disciplines  at  disks  do  not 
generally  involve  any  change  to  the  distributions  of  the  latency 
or  data  transfer  time  phases  of  service,  the  new  queueing 
discipline  can  be  modeled  simply  by  using  a  seek  time  mean  and 
variance  appropriate  to  the  new  discipline,  and  the  same  latency 
and  data  transfer  distributions  as  under  FCFS. 

The  examples  presented  in  this  section  are  of  the  IBM  3330 
under  FCFS  and  SSTF  scheduling  with  the  assumption  that  requests 
arrive  randomly  and  are  distributed  uniformly  over  the  cylinders. 
Rotational  latency  is  taken  to  be  uniform  from  0  to  16.7  ms.  (one 
disk  revolution).  Data  transfer  time  is  taken  to  be  constant; 
statistics  are  collected  assuming  both  one  quarter  and  three 
quarters  of  a  revolution. 

Wilhelm  [73a]  has  derived  the  mean  and  variance  of  the  seek 
distance  for  FCFS  and  SSTF  scheduling  under  the  assumption  that 
requests  are  independent  and  spread  uniformly  over  the  cylinders. 
Using  a  linear  function  from  10  to  55  ms  as  an  approximation  to 
the  time  required  to  move  a  given  number  of  cylinders  for  an  IBM 
3330  disk,  and  the  fact  that  the  3330  has  411  cylinders,  the  mean 
seek  times,  MST,  and  variance  of  seek  times,  VST,  for  FCFS  and 
SSTF  are 


        MST_FCFS  =  10 + .1095 * 412 / 3

        VST_FCFS  =  (.1095)^2 * 412 * 409 / 18

(9)     MST_SSTF  =  10 + .1095 * (n+3) / (2 (n+1) (n+2))

(10)    VST_SSTF  =  (.1095)^2 * (n^3 + 11n^2 + 16n + 1) /
                     (4 (n+1)^2 (n+2) (n+3))


Using  these  functions,  statistics  were  obtained  for  mean  wait 
and  service  times,  and  mean  queue  length  under  a  number  of  loads 
for 4, 6, and 8 module systems. The data transfer time was taken
to be either one quarter or three quarters of a revolution. The
predictions  of  the  models  are  shown  in  figure  5. 
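
As a small check on the FCFS constants, the slope of the linear seek time approximation and the resulting FCFS mean and variance can be computed directly. The sketch assumes only what is stated above: a seek time that grows linearly from 10 to 55 ms over the 411 cylinders of the 3330. The variable names are hypothetical, and the SSTF expressions (9) and (10) are deliberately not evaluated here.

```python
CYLINDERS = 411
SEEK_MIN_MS = 10.0
SEEK_MAX_MS = 55.0

# Slope of the linear approximation, in ms per cylinder (about .1095).
slope = (SEEK_MAX_MS - SEEK_MIN_MS) / CYLINDERS

# FCFS mean and variance of seek time, as in the unnumbered FCFS formulas.
mst_fcfs = SEEK_MIN_MS + slope * 412 / 3      # ms
vst_fcfs = slope ** 2 * 412 * 409 / 18        # ms squared

print(f"slope = {slope:.4f} ms/cylinder")
print(f"MST_FCFS = {mst_fcfs:.2f} ms, VST_FCFS = {vst_fcfs:.1f} ms^2")
```

Under these assumptions the FCFS mean seek time is roughly 25 ms, which is larger than both the mean rotational latency (half of 16.7 ms) and either data transfer time.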

Since  the  SSTF  seek  time  distribution  depends  on  the  queue 
length,  the  coefficient  of  variation  of  the  disk  service  time, 
and  therefore  the  structure  of  the  service  center,  also  depend  on 
the  queue  length.  This  is  a  situation  which  is  difficult  to 
model.  To  overcome  this  problem,  the  seek  time  mean  and  variance 
in  the  model  for  all  queue  lengths  are  taken  to  be  those 
predicted  by  using  the  mean  queue  length  in  equations  (9)  and 
(10).  This  approximation  is  justified  by  noting  that  the  queue 
length  under  SSTF  is  self-regulating  and  so  should  tend  to  stay 
near  the  mean.  As  in  the  case  of  the  closed  systems  considered 
in  section  3,  an  iterative  technique  based  on  the  channel 
utilization  was  employed  to  solve  the  system.  In  this  case, 
however, the iteration was stopped when the resulting queue length
was sufficiently close to the predicted one.

Figure  5  shows  that  SSTF  can  significantly  improve  the  mean 
performance  of  a  set  of  disk  modules  over  that  attainable  with 
FCFS  scheduling.  However,  it  is  apparent  that  the  benefits  of 
SSTF  are  highly  dependent  on  the  channel  utilization.  In  the 
cases  of  high  channel  utilization,  decreasing  the  mean  seek  time 
does  very  little  to  affect  the  average  service  time  of  requests 
to  the  modules.  This  is  to  be  expected,  since,  as  channel 
utilization  increases,  the  channel  becomes  more  of  a  bottleneck 
in  the  I/O  subsystem.  Reducing  the  mean  seek  time  in  such  cases 
can  only  lead  to  longer  average  channel  queue  lengths,  not  to  a 
significantly  reduced  service  time.  If  the  channel  utilization 
is  low  enough,  the  m  module  system  becomes  an  approximation  of 
the single module system discussed by Teorey and Pinkerton [72].
In  these  cases  the  seek  and  latency  phases  of  service  represent 
the  bottleneck  in  the  I/O  subsystem,  so  reducing  the  mean  seek 
time  does  result  in  an  improvement  of  performance. 


5.  Conclusions

We  have  developed  a  model  of  RPS  equipped  disk  storage 
systems which is more accurate than those previously proposed.
Our  model  relies  on  simulation  to  obtain  parameters  for  phases  of 
service  whose  complexity  had  resulted  in  significant  errors  in 
previous  analyses.  The  actual  model  developed  was  a  single 
server  M/G/1  queue.  This  model  was  then  incorporated  into  a 
closed  queueing  network  and  an  approximate  solution  method 
described  for  obtaining  performance  measurements.  The 
predictions of the open and closed models were within 3 and 5%
respectively  for  utilizations,  and  7  and  12%  for  queue  lengths, 
as  validated  by  simulations. 

Finally,  application  of  the  model  was  demonstrated  by  a 
comparison  of  SSTF  scheduling  to  FCFS  scheduling.  The  model 
demonstrates  that  SSTF  exhibits  the  greatest  improvement  over 
FCFS at high loads, as is expected, and that the major advantage
of SSTF is to increase the load at which the disk system will
saturate.


[Figure 5: SSTF and FCFS service times; SSTF and FCFS wait times. Data transfer time = 4.18 ms.]


AN OVERVIEW OF THE QSOLVE SYSTEM


John  Zahorjan  and  Allan  I.  Levy 


I.  Introduction

QSOLVE  is  a  general  purpose  closed  queueing  network  solver. 
It  is  capable  of  providing  exact  solutions  of  multiple  class 
queueing  networks  with  a  variety  of  service  disciplines  and 
service time distributions. QSOLVE's input consists of a
description of a queueing network. From this description,
QSOLVE automatically generates and solves the equilibrium global
balance equations governing the behavior of the system. QSOLVE's
output consists of a set of performance measurements of the input
network. 


II.  Inputs

•  Customer  Population 

The customer population is described by two variables:
the number of classes in the model, and a vector giving
the  number  of  customers  in  each  of  the  classes.  There 
is  no  limitation  on  the  number  of  classes  or  number  of 
customers  in  each  class. 

•  Topology 

The topology of the queueing network is given by the
number  of  service  centers  in  the  model,  and  the  routing 
probabilities  between  service  centers  for  each  class  (a 
routing  probability  is  the  probability  that  a  customer 
of  a  particular  class  leaving  center  i  proceeds  directly 
to center j). Each class of customer may have different
routing probabilities, but all customers in the same class
must have the same probabilities. There is no provision
for transitions involving a change of customer class.


This paper is an internal SAM document.

•  Service Centers


Each  service  center  has  a  service  discipline  and  a 
service  structure.  The  service  discipline  specifies  the 
algorithm  by  which  customers  at  the  service  center  are 
ordered  to  receive  service.  The  service  discipline  at  a 
service  center  is  the  same  for  all  classes.  The  service 
structure  of  a  service  center  is  the  method  of  stages 
representation  of  the  service  time  distribution  at  that 
center [Levy77]. The service structure may vary with
class,  unless  the  service  discipline  at  the  center  is 
PS,  CQ,  PSLD,  or  CQLD  (see  next  section) . 


□  Service Disciplines

All service disciplines are class independent (i.e.,
there are no explicit priority disciplines).

FCFS  -  First-come-first-served. Customers are served
in the order of their arrival.

LCFS  -  Last-come-first-served. Customers are served
in inverse order of their arrival.

LCFSPR  -  Last-come-first-served-preemptive-resume. An
arriving customer preempts any customer currently
in service. When a customer completes service, the
customer that was previously preempted (if any)
resumes execution from the point at which its
service was suspended.

NQ  -  No queueing. An infinite number of identical
servers exist at the service center, so that no
queueing occurs.

PS  -  Processor sharing. Customers arriving at the
service center begin service immediately, and
each customer gets an equal share of the total
processing power of the service center.

CQ  -  Composite queueing [Towsley75]. Composite
queueing is a service discipline which is
useful for implementing some approximation
techniques. Under CQ, each customer receives
an equal portion of the processor power
available for that customer class at that
center. The processor power available to a
class may depend on the number of customers of
that class at the server.

FCFSLD, LCFSLD, PSLD, CQLD (Load dependent disciplines)  -
These are the load dependent versions of the
corresponding disciplines listed above. The
mean service time for each customer class at a
load dependent server may vary with the number
of customers of that class at the center.

□  Service  Structure 

There  are  three  service  structures  implemented  in 
QSOLVE. The particular service structure chosen for a
service center depends on the coefficient of variation
(standard deviation divided by the mean) of the
service time distribution for that class at that
server.

(i)  If  the  coefficient  of  variation  is  greater  than 
one,  a  two  stage  hyperexponential  is  used  to 
represent  the  service  time  distribution.  The 
parameters of the hyperexponential are chosen
according  to  [Sauer75]. 

(ii)  If  the  coefficient  of  variation  equals  one,  an 
exponential  service  structure  is  used. 

(iii)  If  the  coefficient  of  variation  is  less  than 
one,  a  generalized  Erlang  service  structure  is 
used  [Levy77]. 

In  the  case  of  load  dependent  service  centers,  a 
single  coefficient  of  variation  (specified  by  the 
user)  determines  the  load  independent  service 
structure. 
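
The selection rule in (i) to (iii) can be written as a short function. This is an illustrative sketch in Python, not QSOLVE's PL/I code: the function names, the returned labels, and the tolerance used to decide that a coefficient of variation "equals one" are all assumptions. Fitting the stage parameters themselves, per [Sauer75] and [Levy77], is not attempted.

```python
import math


def coefficient_of_variation(mean, variance):
    """Standard deviation divided by the mean of a service time distribution."""
    if mean <= 0 or variance < 0:
        raise ValueError("mean must be positive and variance non-negative")
    return math.sqrt(variance) / mean


def service_structure(cv, tol=1e-9):
    """Pick the method of stages representation from the coefficient of variation.

    The tolerance for treating cv as equal to one is an assumption of this
    sketch; the labels returned are descriptive only.
    """
    if cv < 0:
        raise ValueError("coefficient of variation cannot be negative")
    if abs(cv - 1.0) <= tol:
        return "exponential"
    if cv > 1.0:
        return "two stage hyperexponential"
    return "generalized Erlang"


# Hypothetical service time with mean 40 and variance 400 (cv = 0.5).
example = service_structure(coefficient_of_variation(40.0, 400.0))
```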


III.  Outputs 

The  following  performance  measures  are  computed  and  printed 
by  class  for  each  service  center  in  the  network: 

Utilization - the percentage of time that any customer of a
given  class  is  in  service  at  the  center 

Throughput  -  The  mean  rate  at  which  customers  of  a  given 
class  leave  the  center 

Mean  queue  length  -  The  average  number  of  customers  of  a 
given  class  at  the  center 

Standard  deviation  of  queue  length 

Residence time - The average amount of time a customer of a
given  class  spends  at  the  service  center  (wait  time 
plus  service  time) 

Transit time - The expected amount of time for a customer of
a given class leaving a center until he returns to that
center


Appendix

Two  versions  of  QSOLVE  currently  exist.  One  solves  the  set 
of  equilibrium  balance  equations  by  Gaussian  elimination,  and  the 
other  by  the  iterative  SOR  technique.  Neither  method  is 
appropriate  for  all  networks.  The  Gaussian  elimination  QSOLVE 
requires a great deal of storage and computing time, but if
enough of these are available the solution will be exact (to the
limits of roundoff error). The SOR version of QSOLVE requires
much  less  storage  and  CPU  time,  but  convergence  of  the  iteration 
is  not  guaranteed.  A  modification  of  QSOLVE  is  currently 
underway  which  will  solve  the  simultaneous  equations  using  a 
sparse matrix technique. We hope this will combine the
advantages  of  both  Gaussian  elimination  and  SOR  in  a  practical 
manner. 


QSOLVE is implemented in PL/I. The Gaussian elimination
version exists as a source file, and has been compiled only under
the PL/I checkout compiler. Networks with up to 175 states have
been solved on the University of Toronto Computing Centre's
S/370-165 under OS with HASP in approximately 300K bytes of main
store and less than 3 minutes of CPU time. The SOR version of
QSOLVE exists both as source and object files. The object files
were created by the PL/I optimizing compiler under full
optimization. They are in OS format, but they can easily be
adapted to CMS by prefacing the object files with an extra ESD
card. This version of QSOLVE has been run on a 1M byte virtual
machine running under VM/370-CMS on a S/370-168. Networks with
up to 1100 states have been solved in just under 90 seconds of
CPU time.


ON THE RELATIVE CONTROLLABILITY OF MEMORY POLICIES


We investigate the ability of memory management policies to
act as load controllers in a multiprogrammed, virtual memory
computer system. We consider in detail the knee criterion
(operate a program so that its resident set size is constrained
to average near the knee of its lifetime function). The Working
Set (WS), Page Fault Frequency (PFF), and Least Recently Used
(LRU) memory policies are used as representative policies and are
compared as to their knee criterion performance. The issue of
dynamic adjustment of memory policy parameter values is also
discussed. We conclude that WS has several performance
advantages over PFF.


Introduction


One of the earliest problems encountered in multiprogrammed,
virtual  memory  computer  systems  was  that  of  thrashing  -  the 
collapse  of  performance  due  to  an  overcommitment  of  main  memory 
caused  by  operating  at  too  high  a  load  (degree  of 
multiprogramming) [2]. This is illustrated in Figure 1, where
system  throughput  (job  transactions  completed  per  unit  time)  is 
the  performance  measure.  Operating  at  an  average  load  greater 
than  n2  produces  thrashing. 

The  objective  of  a  load  controller  is  to  regulate  both  the 
load  and  memory  policy  so  that  system  performance  remains  near 
optimal.  (We  choose  system  throughput  as  our  performance  measure 
of  interest  because  it  is  closely  related  to  mean  response  time 
and  processor  utilization  [4].)  Figure  1  shows  that  we  can 
conveniently define a plateau (n1,n2) on the throughput curve
T(n) so that throughput is within some specified tolerance (e.g.,
5%) of the optimal T(n0).

Perhaps  the  simplest  load  controller  is  one  that  simply 
searches  by  varying  the  load  n  and  observing  T(n)  for  an  optimal 
throughput.  This  method  has  several  limitations:  the  search  may 
find  a  local  maximum,  or  it  may  be  in  error  because  conditions  in 
the  system  may  not  be  accurately  reflected  by  observing  jobs 
departing  from  the  system. 


Denning  et  al.  [4]  have  investigated  a  variety  of  adaptive 
load  control  mechanisms.  They  observed  that  a  practical 
controller  must  avoid  the  high  overhead  of  searching  for  the 
maximum  of  a  control  function.  Instead,  it  should  use 
supplemental  measures  whose  values  indicate  the  most  desirable 
direction  for  an  adjustment  of  load  or  memory  policy  parameter. 
Further,  the  most  useful  supplemental  measures  are  related  to 
program  behaviour  because  load  changes  are  strongly  correlated 
with  changes  in  the  main  memory  allocation  available  to  each 
program. 


They investigated three supplemental measures in the context
of a simple queueing network model and found that these measures
were effective in locating optimal loads. The supplemental
measures considered were:

the knee criterion: operate each program so that its mean
resident set size averages near the knee of its lifetime
function,

the L=S criterion: operate the system so that the system lifetime
L(n) is approximately equal to S, where S is the page swap
time,

the 50% criterion: operate the system so that the paging device
utilization is approximately 50%.

The knee criterion was found to be the most robust control
policy.


Figure 2  The lifetime function curve


This paper continues the investigation of Denning et al. One
purpose is to consider the knee criterion more fully. In
particular, we relate the knee criterion to program behaviour and
memory  management  policies.  The  Least  Recently  Used  (LRU)  [3], 
Working  Set  (WS)  [2],  and  Page  Fault  Frequency  (PFF)  [1] 
replacement  algorithms  are  chosen  as  representative  memory 
policies.  We  examine  whether  the  knee  criterion  is  a  good 
criterion  for  the  LRU,  WS,  and  PFF  policies,  and  how  hard  it  is 
to  operate  near  the  lifetime  knees  for  these  policies.  We  also 
consider  the  issue  of  the  dynamic  adjustment  of  the  memory  policy 
parameter  to  improve  performance.  We  investigate  the  required 
number  of  parameter  adjustments  to  achieve  a  certain  level  of 
performance  for  the  WS  and  PFF  memory  policies.  This  gives  an 
indication  of  the  overhead  present  in  the  respective  controllers. 
We  also  describe  certain  pitfalls  connected  specifically  with 
adjustment  of  the  PFF  parameter. 

The  context  of  our  study  is  also  a  simple  queueing  network 
model,  with  extensive  use  of  address  reference  strings  from 
actual  virtual  memory  programs  to  generate  parameter  values  for 
the  model.  Our  results  corroborate  those  of  Denning  et  al.  and 
also  provide  interesting  empirical  data  on  WS  and  PFF 
performance.  We  begin  with  a  review  of  background  material. 


Background 


We first consider performance measures. The lifetime
function L(x) of a program under a given memory policy is the
mean virtual time between page faults (mean interfault interval)
when the program's resident set averages x pages. Empirical
lifetime functions usually have an approximately convex region
for small x followed by an approximately concave region, with
local variations, as shown in Figure 2. The lifetime function is
the reciprocal of the familiar page fault rate function F(x).

The memory space-time product ST(x) of a program under a given
memory policy is the product of the amount of memory occupied by
the program and the real time spent occupying it, when the
program's resident set averages x pages in virtual time. ST(x)
can conveniently be expressed as the sum of a virtual time
component and a real time component (in units of page-seconds,
for example):

                  P
ST(x) = K*x  +   SUM  D*x(i)
                 i=1

where K is the reference string length, D is the mean real time
delay to service a page fault, P is the number of page faults,
and x(i) is the memory size during the ith page fault.


We  turn  now  to  memory  policies.  The  LRU  policy  is  a  fixed 
partition  policy  with  memory  size  as  its  parameter.  At  any  time, 
the  resident  set  under  LRU  with  memory  size  m  consists  of  those  m 
pages  most  recently  referenced.  The  WS  policy  is  a  variable 
partition  policy  with  window  size  as  its  parameter.  At  time  t, 
the resident set under WS, W(t,T), is the set of those pages
contained in a backward-looking window of size T, including the
reference at time t, r(t), i.e., W(t,T) = {r(t-T+1), ..., r(t)}. The
PFF  policy  is  also  a  variable  partition  policy  with  window  size 
as  its  parameter.  At  page  fault  time  t,  the  resident  set  under 
PFF  is  determined  by  observing  the  time  t'  of  the  previous  page 
fault  and  comparing  the  interfault  interval  with  a  standard 
parameter  value  THRESH.  The  resident  set  x(t)  for  parameter 
THRESH  is  specified  as 


           x(t') + r(t)      if t-t' < THRESH
   x(t) =
           W(t, t-t')        otherwise


Intuitively, PFF attempts to decrease the memory allocation at
page fault time if the most recent interfault interval is "too
long".
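
The WS and PFF definitions above can be made concrete with a small trace-driven sketch. It is an illustration only, not the measurement software used in this study: the function names and the six-reference string are hypothetical, virtual time is counted in references starting at 0, the lifetime is approximated as the string length divided by the number of faults, and the very first fault under PFF (which has no previous fault time) is treated as an "add the page" case.

```python
def ws_lifetime(refs, window):
    """Mean resident set size and lifetime under WS with window size T.

    A reference at time t faults unless the page lies in W(t-1, T),
    i.e. unless it was referenced at one of the times t-T .. t-1.
    """
    last_ref = {}          # page -> time of its most recent reference
    faults = 0
    total_size = 0
    for t, page in enumerate(refs):
        if page not in last_ref or last_ref[page] < t - window:
            faults += 1
        last_ref[page] = t
        # |W(t, T)|: pages referenced at times t-T+1 .. t
        total_size += sum(1 for s in last_ref.values() if s > t - window)
    return total_size / len(refs), len(refs) / faults


def pff_faults(refs, thresh):
    """Count page faults under PFF with parameter THRESH."""
    resident = set()
    last_fault = None
    faults = 0
    for t, page in enumerate(refs):
        if page in resident:
            continue
        faults += 1
        if last_fault is None or t - last_fault < thresh:
            # Short interfault interval: x(t) = x(t') + r(t)
            resident.add(page)
        else:
            # Long interval: x(t) = W(t, t-t'), the pages referenced
            # since the previous fault (times t'+1 .. t).
            resident = set(refs[last_fault + 1 : t + 1])
        last_fault = t
    return faults


trace = [1, 2, 1, 3, 1, 2]                      # hypothetical reference string
mean_size, lifetime = ws_lifetime(trace, window=2)
faults_small_thresh = pff_faults(trace, thresh=2)
faults_large_thresh = pff_faults(trace, thresh=3)
```

On this short string a larger THRESH lets PFF keep all three pages resident, so it incurs fewer faults at the cost of a larger resident set.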


Finally,  we  consider  load  control  methods,  both  program-driven
and  load-driven.  It  is  useful  to  classify  these  methods  based  on  the
variable  that  is  directly  controlled.  A  program-driven  method 
specifies  a  resident  set  of  guaranteed  content  for  each  program 
and  requires  that  the  load  be  determined  as  the  number  of 
resident  sets  that  can  exist  together  in  main  memory.  Its  free 
variable  is  the  memory  policy  parameter,  such  as  window  size,  and 
the  load  is  a  dependent  variable.  In  contrast,  a  load-driven 
method  specifies  a  load,  requiring  that  the  memory  policy 
determine  a  memory  partition  that  accommodates  the  given  load. 
Its  free  parameter  is  the  load  and  the  memory  partition  is  a 
dependent  variable. 
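
The distinction between the two kinds of method can be made concrete with a small Python sketch. Both functions are hypothetical illustrations: the first treats resident set sizes as given and derives the load, the second treats the load as given and derives a partition, here assumed to be an equal split, which the text does not prescribe.

```python
def program_driven_load(resident_set_sizes, memory_pages):
    """Load is the dependent variable: admit programs, in order,
    while their resident sets still fit together in main memory."""
    used, load = 0, 0
    for size in resident_set_sizes:
        if used + size > memory_pages:
            break
        used += size
        load += 1
    return load


def load_driven_partition(load, memory_pages):
    """Partition is the dependent variable: given a load, derive the
    per-program memory (equal split is an assumption of this sketch)."""
    if load <= 0:
        raise ValueError("load must be positive")
    return memory_pages // load


print(program_driven_load([30, 40, 50], memory_pages=100))
print(load_driven_partition(4, memory_pages=100))
```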

Additional  background  material  can  be  found  in  [4,7].  Before 
considering  the  knee  criterion  in  more  detail,  we  briefly  discuss 
our  experimental  methodology. 


Experimental  Methodology

Six  virtual  memory  programs  were  traced,  producing  eight
output  trace  tapes.  Two  criteria  were  used  to  select  programs
for  tracing.  First,  the  programs  represented  a  range  of  program 
behaviours.  Second,  the  programs  were  heavily-used  and  were  a 
substantial  load  on  the  system.  Care  was  taken  to  ensure  that 
the  reference  string  length  used  in  the  experiments  was 
sufficient  to  exhibit  consistent  experimental  results.  Lifetime 
functions  and  space-time  products  were  measured  for  several  fixed 
partition  and  variable  partition  policies.  Further  details  can 
be  found  in  [7].


The  Knee  Criterion


A  knee  of  a  lifetime  function  curve  is  the  operating  point 
beyond  which  the  curve  tends  to  flatten  out.  The  primary  knee  is 
defined  geometrically  as  the  point  of  tangency  between  the  curve 
and  the  ray  of  maximum  slope,  from  the  origin,  which  is 
tangential  to  the  curve.  Knees  of  higher  order  can  be  defined 
similarly  in  terms  of  rays  of  smaller  slopes.  As  a  load  control
rule,  the  knee  criterion  constrains  the  mean  resident  set  size  of
a  program  to  average  near  the  primary  knee  of  the  program's 
lifetime  function.  The  knee  criterion  is  suited  for  a  program- 
driven  control  method  because  it  uses  detailed  information  about 
the  dynamic  behaviour  of  individual  programs. 
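
Because a ray from the origin through the point (x, L(x)) has slope L(x)/x, the primary knee of a sampled lifetime curve can be located by maximizing that ratio. The following Python sketch does this for a made-up set of samples; the function name and the data are hypothetical.

```python
def primary_knee(points):
    """Return the (x, L) sample whose ray from the origin is steepest.

    points: iterable of (x, L) pairs with x > 0, where x is the mean
    resident set size and L the lifetime measured at that size.
    """
    return max(points, key=lambda p: p[1] / p[0])


# Made-up lifetime samples: convex at first, then flattening out.
curve = [(10, 50), (20, 400), (30, 900), (40, 1000), (50, 1050)]
print(primary_knee(curve))
```

On measured data the curve is only known at discrete operating points, so the result is the best sampled point rather than an exact tangency.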


The  knee  criterion  is  an  intuitively  appealing  rule,  because 
the  primary  knee  represents  the  "point  of  diminishing  returns"  on 
the  lifetime  function.  However,  there  is  much  more  to  it  than 
intuitive  appeal.  The  primary  knee  maximizes  the  ratio  L(x)/x. 
Suppose one page fault incurs a mean execution delay of D
(corresponding to page swap time and time spent queueing). P
page faults in a program will span a real time interval whose
expected length is P·L(x) + P·D. The memory space-time product
per reference is then

    \frac{x \, [P \cdot L(x) + P \cdot D]}{P \cdot L(x)} = x + D \cdot \frac{x}{L(x)}


Thus, operating at the primary knee minimizes the component of
memory space-time due to paging ((D·x)/L(x)). Because I/O device
speeds and request rates are independent of x, total memory
space-time per job tends to be minimized [4].
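
The link between the knee and the paging component can be spelled out in one line. The sketch below uses only the symbols already defined (x, L(x), D) and assumes D is a positive constant that does not depend on x.

```latex
\min_{x}\; D \cdot \frac{x}{L(x)}
\;\Longleftrightarrow\;
\min_{x}\; \frac{x}{L(x)}
\;\Longleftrightarrow\;
\max_{x}\; \frac{L(x)}{x},
\qquad D > 0 .
```

The right-hand condition is the definition of the primary knee, so the same operating point satisfies both.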


Consider  now  a  complementary  argument.  Observe  a  system  with 
load n for V time units. The total system space-time product in
the system is M·V (for main memory capacity M), and the total
number of job completions is V·T(n). The system space-time per
job is then (M·V)/(T(n)·V), or M/T(n). Because Smith has
observed  that  memory  space-time  calculated  in  a  uniprogramming 
mode  is  related  to  system  space-time  measured  in  a 
multiprogramming  load  [10],  minimizing  system  space-time  (and 
hence  memory  space-time)  is  equivalent  to  maximizing  throughput. 


We  now  have  the  chain  of  arguments:  the  primary  knee  tends  to 
minimize  memory  space-time  per  job,  and  memory  space-time  per  job 
is  minimized  exactly  when  throughput  is  maximized.  The 
implication  therefore  is  that  the  knee  criterion  seems  to  define 
a  load  at  which  throughput  is  optimal.  The  foregoing  is  not  a 
proof;  rather,  it  is  a  plausibility  argument  supporting  the  knee 
criterion.  Denning  et  al.  found  the  knee  criterion  to  be  the
most  robust  of  the  three  control  methods  investigated  [4].  It 
consistently  produced  near  optimal  throughputs  under  a  variety  of 
operating  conditions.  The  purpose  here  is  to  draw  conclusions 
about  the  interaction  between  the  knee  criterion  and  the  memory 
policies  producing  the  lifetime  functions. 


Experiments 

We  first  tested  the  correspondence  between  the  lifetime  knees 
and  the  space-time  minima  for  the  WS,  PFF,  and  LRU  policies.  The 
correspondence  was  in  terms  of  the  relative  percentage  difference 
between  the  lifetime  knee  memory  space-time  value  and  the  minimum 
memory  space-time  value.  A  summary  of  the  data  for  the  eight
trace  tapes  is  displayed  in  Table  1.  WS  had  both  the  lowest  mean
and  lowest  maximum  relative  percentage  difference  between  the
space-time  values  of  the  lifetime  knee  operating  points  and  the
minima  of  the  space-time  product.  PFF  ranked  next,  LRU  last.
The  table  shows  that  the  correspondence  between  knees  and  local 
space-time  minima  is  extremely  strong  for  the  WS  policy  and  is 
also  present,  though  not  as  strongly,  for  the  PFF  and  LRU 
policies.  It  is  important  to  note  that  PFF  did  produce  errors
exceeding  10%  in  the  correspondence  for  two  knees  on  one
reference  string;  we  observed  no  such  erratic  behaviour  for  WS.
LRU  showed  the  weakest  correspondence.  LRU  also  has  the 
troublesome  feature  that  an  adaptive  control  method  cannot 
systematically  locate  the  LRU  space-time  minimum  in  a  simple  way 
because  the  LRU  space-time  curve  typically  has  a  sharp  minimum 
[1,7].  We  show  later  that  WS  and  PFF  appear  to  be  more
controllable  in  that  the  knees  and  minima  can  be  found  by 
indirect  methods. 
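
The correspondence measure can be sketched in Python as follows. The exact normalization used in the study is not stated here, so dividing by the space-time minimum is an assumption, as are the function names and the sample values.

```python
def relative_percentage_difference(value, optimum):
    """Percentage by which `value` departs from `optimum` (assumed form)."""
    return 100.0 * abs(value - optimum) / optimum


def knee_vs_minimum(samples):
    """samples: (x, lifetime, space_time) tuples at the same operating
    points. Compares space-time at the primary knee with the minimum."""
    knee = max(samples, key=lambda s: s[1] / s[0])   # maximizes L(x)/x
    best = min(samples, key=lambda s: s[2])          # space-time minimum
    return relative_percentage_difference(knee[2], best[2])


# Made-up operating points: the knee (x=30) is close to, but not at,
# the space-time minimum (x=40).
samples = [(20, 400, 520.0), (30, 900, 515.0), (40, 1000, 500.0)]
print(knee_vs_minimum(samples))
```

A value near zero means that steering a program toward its lifetime knee also places it near its space-time minimum.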

To investigate further the plausibility argument, we tested
the correspondence between lifetime knee and optimal throughput
in a simple queueing network model [3,7]. The correspondence
was in terms of the relative percentage difference between
lifetime knee throughput and the optimal throughput. Two
values of mean page swap time D and two plateaux of good
performance - within 5% and 10% of optimal throughput - were
considered. Denning et al. had observed that their results depended
on whether D exceeded the knee lifetime by a significant amount.
We also observed this dependence. A summary of the data
for the eight trace tapes is presented in Table 2. The
correspondence between the lifetime knee and optimal throughput
operating points (i.e., the knee criterion) for the small D value
was good for all policies; at the large D value, where the
knee criterion was failing, PFF showed a better correspondence
than WS, with LRU showing poor correspondence.

The table shows that for each memory policy the knee
criterion began to break down when the value of D became much
larger than the knee lifetime. The reason is that D represents
the system delay in responding to a change of locality sets. If
LT is the lifetime at the primary knee and D >> LT, then D is
long compared to the periods of constant memory demand by the
program and the load control policy can no longer adapt to
changes in locality in a timely manner [7]. PFF does better than
WS in this case because its tendency to allocate more memory
during locality changes gives it a larger resident set and
isolates it from this problem.
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.  Two  results  foil 
because  it  operate 


related  question  of  the  width  of  the 
plateaux  allowed  by  WS  and  PFF  was 
range  is  determined  by  the  smallest 
the  model  to  operate  on  a  plateau, 
r  combinations  of  conditions  were 
3,  WS  allowed  a  wider  load  range  in 
                                   WS                PFF                  LRU

Number of knees (primary,          8, 6, 2           8, 5, 1              8, 0, 0
secondary, tertiary)

Mean relative percentage           .35, .37, .95     1.96, 3.48, .90      3.40, -, -
difference between lifetime
knee ST value and ST minimum

Maximum relative percentage        1.64, .97, 1.45   11.46, 10.28, .90    11.70, -, -
difference

Table 1  Summary of lifetime knee - minimum
space-time correspondence data
                     D=5                               D=10

        Mean relative     Number of         Mean relative     Number of
        percentage        programs within   percentage        programs within
        difference        specified         difference        specified
        between knee      control           between knee      control
        T(n) and                            T(n) and
        max T(n)          5%      10%       max T(n)          5%      10%

WS          5.7           6       6            14.9           3       4

PFF         4.9           5       6            12.5           5       5

LRU        12.3           4       6            25.6           1       2

Table 2  Summary of knee criterion data
              D=5                D=10

          5%       10%       5%       10%

WS        4.9      5.8       2.5      4.8

PFF       3.8      5.0       2.6      3.9

Table 3  Load ranges for WS and PFF on the throughput
plateaux
of  operating  conditions.  Also,  PFF  performs  better  than  WS  when
the  mean  page  swap  time  is  significantly  larger  than  the  knee 
lifetime,  as  it  did  for  the  lifetime  knee  -  system  throughput 
correspondence  test.  This  latter  case,  however,  does  not 
represent  a  desirable  situation  for  system  operation. 

To  summarize  the  results  of  this  part  of  the  investigation: 
the  WS  knee  criterion  worked  well,  as  expected.  The  PFF  and  LRU 
knee  criteria  also  showed  good  correspondence.  The  PFF  policy, 
however,  exhibited  erratic  behaviour  on  one  reference  string  when 
its  lifetime  knees  produced  space-time  values  differing  by  more
than  10%  from  the  space-time  minima  values.  In  the  queueing
network  experiments,  WS  allowed  a  wider  load  range  for  two 
plateaux  on  the  throughput  curve  than  PFF.  The  WS  and  PFF  knee 
criteria  were  comparable  when  the  average  relative  difference 
between  lifetime  knee  throughput  and  optimal  throughput  was 
considered.  The  value  of  the  mean  page  swap  time  D  affected  all 
the  results.  Increasing  the  value  of  D  well  beyond  the  knee 
lifetime  ruined  the  knee  criterion.  Further  comparisons  of 
memory  policies  are  given  in  the  next  section. 


Dynamic  Adjustment  of  Memory  Policy  Parameters 
oad  controllers  specify  a  program's  resident
by  tracking  its  reference  string.  In  doing 
anges  in  locality  sets  and  specify  a  memory 
1  times  is  an  estimate  of  the  current 
variable  partition  policies  WS  and  PFF  use  a 
onstant  size  to  detect  locality  set  changes 
for  WS  and  the  threshold  window  size  THRESH
ling  to  consider  changing  parameter  values, 
age  referencing,  to  produce  better  estimates 
en  the  locality  set  is  small,  for  example, 
T  can  be  made  smaller  to  estimate  a  smaller 


Several studies have pursued this approach. Smith proposed
to reduce the window size at locality set changes, thereby
removing unreferenced pages of the old locality set and damping
increases in working set size [10]. Prieve assigned a different,
fixed window size to each page of a program in his Page Partition
replacement policy and showed that it improved page fault rate at
the same mean memory size over WS [8]. The difficulty with these
two methods is that the improvements in general were not
significant enough to justify the cost of implementation.
T-modifying proposals are based on assumptions that controlling the
window size during program execution is beneficial and improves
performance. However, it is not clear whether dynamic window
control is of any use. Chu and Opderbeck observed a wide plateau
in their WS and PFF space-time curves, indicating that the choice
of the window parameter value was not critical [1]. According to
their data, it sufficed for each program to be assigned its own
fixed parameter setting.
the  space-time  product  and
size  intervals  associated  with 
space-time  curve  exhibited 
throughput  curve. 


A  note  of  caution:  our  results  may  depend  on  th 
programs  present  in  our  study.  If  we  discovered  that  a  w 
a  specific  size  gave  (say)  a  10%  level  of  control  fo 
would  not  necessarily  expect  that  window  size  value  to  be 
for  a  different  set  of  programs.  We  are  not  making 
conclusions  about  good  choices  of  window  sizes;  instead, 
presenting  a  methodology  for  assessing  the  overhead  o
policy  controllers. 
Our  interest  was  in  the  interval(s)
to  force  each  of  the  eight  reference  stri 
some  specified  level  of  the  memory  pol 
cost.  (As  a  result  of  the  previous  secti 
to  the  question:  how  hard  is  it  to  find
5%  and  a  10%  plateau  as  illustrated  in  F 
horizontal  lines  in  these  figures  repre 
sizes  causing  the  program  to  operate  on 
Figure  3,  the  vertical  line  A  shows  sett 
time  plateau  (although  several  would  ope
P2,P4,P7) .  Thus,  one  value  of  T  would  be 
a  10%  level  of  control.  Were  a  5%  level 
WS,  at  least  two  values  of  T,  shown 
B1  (T=50,000)  and  B2  (T=118,000)  would  be
necessary  to  change  constantly  between  th 
program  execution;  only  an  initial  sele 
be  needed.  The  results  for  PFF  were  diff 
at  least  three  threshold  window  size 
control  for  these  reference  strings.  At 
need  at  least  four  threshold  window  sizes 
These  results  suggest  that  PFF  is  inherently  more  difficult
to  control  than  WS  because  it  requires  more  distinct  parameter 
values  to  achieve  a  comparable  level  of  performance  over  a  set  of 
reference  strings.  A  well-designed  PFF  controller  therefore 
would  likely  generate  more  overhead  than  a  well-designed  WS 
controller.  This,  in  turn,  would  offset  the  benefits  of  PFF's 
simpler  implementation  [1]. 
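
Counting how many distinct parameter values a controller needs can be posed as an interval-covering problem. The Python sketch below assumes the simplest case, in which each program has a single contiguous range of parameter values that keeps it on the plateau, and uses the standard greedy method for that case; programs with several disjoint ranges would need a more general set-cover treatment. The function name and the intervals are hypothetical, not the measured ones.

```python
def min_parameter_values(intervals):
    """Fewest parameter values such that every interval contains one.

    intervals: list of (low, high) closed ranges, one per program.
    Greedy rule: scan by right endpoint and pick that endpoint
    whenever the current interval is not yet covered.
    """
    chosen = []
    for low, high in sorted(intervals, key=lambda iv: iv[1]):
        if not chosen or chosen[-1] < low:
            chosen.append(high)  # right endpoint reaches furthest ahead
    return chosen


# Made-up plateau ranges (in thousands of references) for five programs.
ranges = [(10, 60), (40, 120), (50, 90), (100, 200), (150, 300)]
print(min_parameter_values(ranges))
```

The length of the returned list is the number of parameter settings a controller would have to choose among, which is the overhead measure discussed above.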


There  are  also  stability  problems  present  in  a  PFF 
controller.  We  have  observed  both  anomalous  and  gap  behaviours 
in  PFF  reference  strings  [7].  Anomalous  behaviour  is  present 
when  changes  in  parameter  values  do  not  produce  the  desired 
performance  improvements.  For  example,  increasing  the  PFF 
threshold  window  size  may  unexpectedly  lead  to  a  smaller  mean 
memory  allocation  or  a  higher  page  fault  rate,  or  both  (see 
Figure  5  and  [6]).  No  such  behaviour  is  possible  for  the  WS 
window size [5]. Gap behaviour is present when a small increase
in threshold window value produces a small increase in lifetime
value, but a large jump in mean resident set size (see Figure 6).
In other words, a small upward adjustment in parameter value may
cause some programs to place a sudden heavy demand on the memory
subsystem. No such behaviour was observed for WS.

Figure 3  WS window sizes for 5 percent and 10 percent plateaux
on the space-time curve

Figure 4  PFF threshold window sizes for 5 percent and 10 percent
plateaux on the space-time curve

Figure 6  A lifetime function exhibiting gap behaviour
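
Both kinds of instability can be screened for in measured curves with a short Python sketch. It assumes one measurement per threshold value, sorted by increasing threshold; the jump factor used to flag a gap is an arbitrary illustrative choice, and all names and numbers are hypothetical.

```python
def find_anomalies(curve):
    """Thresholds at which raising the threshold lowered the mean
    memory allocation or raised the page fault rate.

    curve: list of (threshold, mean_memory, fault_rate), sorted by
    increasing threshold.
    """
    flagged = []
    for (_, mem0, rate0), (thresh1, mem1, rate1) in zip(curve, curve[1:]):
        if mem1 < mem0 or rate1 > rate0:
            flagged.append(thresh1)
    return flagged


def find_gaps(curve, jump_factor=1.5):
    """Thresholds at which mean memory jumps by more than jump_factor
    relative to the previous setting (jump_factor is an assumption)."""
    return [thresh1
            for (_, mem0, _), (thresh1, mem1, _) in zip(curve, curve[1:])
            if mem1 > jump_factor * mem0]


# Made-up measurements: (threshold, mean memory in pages, fault rate).
curve = [(100, 20.0, 0.010), (200, 24.0, 0.008),
         (300, 22.0, 0.009), (400, 45.0, 0.007)]
print(find_anomalies(curve))
print(find_gaps(curve))
```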

To  summarize  the  results  of  this  part  of  our  investigation: 
a WS load controller incurs less overhead than a PFF
controller; it consistently requires fewer parameter settings for
the set of programs to achieve both 5% and 10% plateaux in the
space-time curves. Moreover, a WS controller is more stable than
a PFF controller because it is not subject to anomalous or gap
behaviours.


Conclusions

Gap  and  anomalous  behaviours  are  certainly  vexing  problems 
for  a  practical  PFF  load  controller.  We  have  observed  both  in 
practice.  However,  it  is  probable  that  over  large  increases  in 
PFF  threshold  window  size  there  is  little  practical  likelihood  of 
observing  control  problems  relating  to  the  lifetime  function 
during  computer  system  operation. 

What  is  more  troublesome  is  the  observation  that,  for  the  set 
of  eight  trace  tapes,  there  did  not  exist  a  single  PFF  threshold 
window  size  which  would  give  a  10%  level  of  control,  as  there  was 
for  WS.  A  PFF  controller  must  classify  a  program  and  select  its 
proper  parameter  setting,  involving  overhead  not  present  for  a  WS 
controller  at  the  10%  level.  Given  the  knee  criterion  and 
dynamic  parameter  adjustment  results,  we  conclude  that  the 
apparent  ease  of  PFF  implementation  should  be  balanced  against 
the  performance  benefits  of  WS  when  considering  the  design  or 
modification  of  a  memory  policy. 